I'm using Rpy2 to plot dataframes with ggplot2. I make the following plot:

p = ggplot2.ggplot(iris) + \
    ggplot2.geom_point(ggplot2.aes_string(x="Sepal.Length", y="Sepal.Width")) + \
    ggplot2.facet_wrap(Formula("~Species"))
p.plot()
r["dev.off"]()  

I'd like to annotate each subplot with some statistics about the data it shows. For example, I'd like to compute the correlation between x and y within each subplot and place it in the top right corner of that subplot. How can this be done? Ideally I'd like to convert the dataframe from R to a Python object, compute the correlations and then project them onto the scatter plots. The following conversion does not work, but this is how I'm trying to do it:

# This does not work 
#iris_df = pandas.DataFrame({"Sepal.Length": rpy2.robjects.default_ri2py(iris.rx("Sepal.Length")),
#                            "Sepal.Width": rpy2.robjects.default_ri2py(iris.rx("Sepal.Width")),
#                            "Species": rpy2.robjects.default_ri2py(iris.rx("Species"))})
# So we access iris through R to compute the correlation
x = iris.rx("Sepal.Length")
y = iris.rx("Sepal.Width")
# compute r.cor(x, y) separately for each Species
# Assume we get a vector with one entry per Species giving the
# correlation between that Species' Sepal Length and Width
p = ggplot2.ggplot(iris) + \
    ggplot2.geom_point(ggplot2.aes_string(x="Sepal.Length", y="Sepal.Width")) + \
    ggplot2.facet_wrap(Formula("~Species"))
# ...
# How to project the correlation onto each subplot?
p.plot()
r["dev.off"]()

But assuming I could actually access the R dataframe from Python, how could I plot these correlations? Thanks.

2 Answers

The solution is to create a dataframe with a label for each subplot. It must contain the faceting column under the same name as in the dataframe with the original data (here Species), so that each label is drawn in the matching subplot. The labels can then be plotted with:

p += ggplot2.geom_text(data=labels_df, mapping=ggplot2.aes_string(x="1", y="1", label="labels"))

where labels_df is the dataframe containing the labels and labels is the name of the column in labels_df that holds the label text. (1, 1) in this case is the position of the label in each subplot, in data coordinates.
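As a rough sketch of how labels_df might be built for the iris example: the helpers below compute one Pearson correlation per species in plain Python, and add_correlation_labels then wraps the results in an R data frame and adds the text layer. The function names, the x/y/labels column names, and the use of max(x), max(y) as the top-right anchor are assumptions made for illustration (they presume the default fixed facet scales). The rpy2 part needs a working R installation with ggplot2 and is not exercised here.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_labels(xs, ys, groups):
    """Return {group: "r = 0.xx"}, one entry per facet."""
    by_group = defaultdict(lambda: ([], []))
    for x, y, g in zip(xs, ys, groups):
        by_group[g][0].append(x)
        by_group[g][1].append(y)
    return {g: "r = %.2f" % pearson(gx, gy) for g, (gx, gy) in by_group.items()}


def add_correlation_labels(p, iris):
    """Hypothetical helper: add one text label per facet to plot p.

    Imports rpy2 lazily so the pure-Python helpers above work without R.
    """
    from rpy2 import robjects
    import rpy2.robjects.lib.ggplot2 as ggplot2

    xs = list(iris.rx2("Sepal.Length"))
    ys = list(iris.rx2("Sepal.Width"))
    # Convert the factor to plain strings so Python sees the species names.
    species = list(robjects.r["as.character"](iris.rx2("Species")))
    labels = correlation_labels(xs, ys, species)

    names = list(labels)
    labels_df = robjects.DataFrame({
        # Same column name as the faceting variable, so each row
        # is drawn in its own subplot.
        "Species": robjects.StrVector(names),
        "labels": robjects.StrVector([labels[n] for n in names]),
        # Data coordinates of the top-right corner (assumes fixed scales).
        "x": robjects.FloatVector([max(xs)] * len(names)),
        "y": robjects.FloatVector([max(ys)] * len(names)),
    })
    # hjust/vjust = 1 right- and top-align the text at (x, y).
    return p + ggplot2.geom_text(
        data=labels_df,
        mapping=ggplot2.aes_string(x="x", y="y", label="labels"),
        hjust=1, vjust=1)
```

With the plot from the question, this would be used as p = add_correlation_labels(p, iris) before calling p.plot().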

I found that @user248237dfsf's answer didn't work for me. ggplot got confused between the data frame I was plotting and the data frame I was using for labels.

Instead, I used:

ggplot2_env = robjects.baseenv['as.environment']('package:ggplot2')

class GBaseObject(robjects.RObject):
  @classmethod
  def new(*args, **kwargs):
    args_list = list(args)
    cls = args_list.pop(0)
    res = cls(cls._constructor(*args_list, **kwargs))
    return res

class Annotate(GBaseObject):
  _constructor = ggplot2_env['annotate']
annotate = Annotate.new

Now, I have something that works just like the standard annotate.

annotate(geom = "text", x = 1, y = 1, label = "MPC")

One minor comment: I don't know if this will work with faceting.
