I am trying to better understand ggplot2, so while I am looking for a way to accomplish the task below, I would also appreciate an explanation of why it does not currently work. So far I could not find information on the topic.

Both of my questions are about using expressions inside ggplot2.

I have a data.frame:

    set.seed(1)
    DF <- data.frame(A = 1:24, B = LETTERS[rep(1:4,6)], C = rep(1:3,8))

    head(DF, n = 9)

    #  A B C
    #1 1 A 1
    #2 2 B 2
    #3 3 C 3
    #4 4 D 1
    #5 5 A 2
    #6 6 B 3
    #7 7 C 1
    #8 8 D 2
    #9 9 A 3

I want to plot the mean value of the column A, grouped by the values in B without transforming my data. I would expect that it is possible to do something like the following:

    ggplot(DF) + geom_point(aes(x = B, y = mean(A), group = B))

but ggplot2 plots the overall mean rather than the grouped mean, so mean(A) is the same for every value of B.

How could I go about plotting this without transforming my data?

Another barrier I run into from time to time is trying to put an expression inside facet_grid() or facet_wrap().

For example, say I want to use the modulo operator to create a temporary column to facet by later:

    DF$A %% 4
    # 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0

I could tack this column onto my data frame. But let's impose a restriction that I cannot transform my data. I would have expected that I could do something like this:

    ggplot(DF) + geom_point(aes(x = B, y = C)) + facet_grid({A %% 4} ~ .)

or

    ggplot(DF) + geom_point(aes(x = B, y = C, group = A)) + facet_grid({A %% 4} ~ .)

or even

    ggplot(DF) + geom_point(aes(x = B, y = C)) + facet_grid(formula({A %% 4} ~ .))

but they all return the error:

    Error in layout_base(data, rows, drop = drop) : 
      At least one layer must contain all variables used for facetting

Could anyone explain, in a way that reveals how ggplot2 works, why these attempts fail and how I might get the desired results without transforming the data?

1 Answer

Why does your plot only have one y value? Because mean(A) is evaluated over the entire column, so it produces a single value that is reused for every point, whatever the group.
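To make the difference concrete, here is a small base R sketch (not part of the original question) that compares the single overall mean with the per-group means the plot was meant to show. The names `overall` and `by_group` are just illustrative.

```r
DF <- data.frame(A = 1:24, B = LETTERS[rep(1:4, 6)], C = rep(1:3, 8))

# The mean of the whole column: a single number, regardless of B
overall <- mean(DF$A)
overall

# One mean per level of B, which is what the plot should display
by_group <- tapply(DF$A, DF$B, mean)
by_group
```

The first result has length one, which is why every point lands at the same height; the second has one entry per level of B.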

If you want to do a transformation, you'll have to use a stat_* function. That is exactly what they are supposed to do.

In this case:

    # In ggplot2 >= 3.3.0 the argument is `fun`; older versions call it `fun.y`
    ggplot(DF, aes(x = B, y = A, group = B)) + 
      stat_summary(fun = 'mean', geom = 'point')

Or the equivalent:

    # Same summary, requested from the geom side (`fun.y` in ggplot2 < 3.3.0)
    ggplot(DF, aes(x = B, y = A, group = B)) + 
      geom_point(stat = 'summary', fun = 'mean')
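If you want to check what the stat actually computed, one option is ggplot2's `layer_data()`, which returns the data a layer ends up drawing. This is a minimal sketch that assumes ggplot2 3.3.0 or later (for the `fun` argument); `p` and `computed` are illustrative names.

```r
library(ggplot2)

DF <- data.frame(A = 1:24, B = LETTERS[rep(1:4, 6)], C = rep(1:3, 8))

# Assumes ggplot2 >= 3.3.0, where the argument is `fun` rather than `fun.y`
p <- ggplot(DF, aes(x = B, y = A, group = B)) +
  stat_summary(fun = "mean", geom = "point")

# Data as drawn by the first layer: one row per group, y holds the group mean
computed <- layer_data(p)
computed[, c("x", "group", "y")]
```

DF itself is never modified; the summary only exists inside the built plot.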

I don't see a way to do facetting on non-existing columns.
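That said, more recent ggplot2 releases may offer a way. The sketch below assumes ggplot2 3.0.0 or later, where facet variables can be supplied through `vars()`, which is documented as accepting expressions as well as column names. `p_facet` is an illustrative name, and this is not something the setup in the question was tested against.

```r
library(ggplot2)

DF <- data.frame(A = 1:24, B = LETTERS[rep(1:4, 6)], C = rep(1:3, 8))

# Assumes ggplot2 >= 3.0.0. vars() quotes the expression, which is then
# evaluated against the layer data when the plot is built, so no column
# is added to DF.
p_facet <- ggplot(DF) +
  geom_point(aes(x = B, y = C)) +
  facet_grid(rows = vars(A %% 4))

# Number of distinct panels produced by the faceting expression
n_panels <- length(unique(layer_data(p_facet)$PANEL))
n_panels
```

On an older installation where `vars()` is unavailable, adding the column to a copy of the data is probably still the simplest workaround.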
