I have a pandas DataFrame of "factors" (categorical variables), floats, and integers. I would like to make R lattice-style plots from it, conditioning and grouping on the categorical variables. I've used R extensively and have written custom panel functions to get plots formatted exactly how I want them, but I'm struggling to produce the same kinds of plots succinctly in matplotlib. I have been experimenting with layouts and subplot2grid, but just can't seem to get it right.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
nRows = 500
df = pd.DataFrame({'c1' : np.random.choice(['A','B','C','D'], size=nRows),
                   'c2' : np.random.choice(['P','Q','R'], size=nRows),
                   'i1' : np.random.randint(20,50, nRows),
                   'i2' : np.random.randint(0,10, nRows),
                   'x1' : 3 * np.random.randn(nRows) + 90,
                   'x2' : 2 * np.random.randn(nRows) + 89})
I would like to produce plots such as the following (R lattice code examples):
x1 vs. x2 for each level of c1 (lattice code)
xyplot(x1 ~ x2 | c1, data = df)
x1 vs. x2 for each level of c1 with "global" legend c2 (symbols or colors)
xyplot(x1 ~ x2 | c1, groups = c2, data = df)
histograms of x1 for each c2
histogram(~x1 | c1, data = df)
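For comparison, here is a minimal matplotlib sketch of roughly what the lattice calls above produce. It is one possible approach under stated assumptions, not a definitive translation: the seeded random generator, the two-column layout, the 20 shared histogram bins, and names such as `ncols`, `fig_xy`, and `bin_edges` are illustrative choices that do not come from the lattice code.

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same shape of data as in the question, seeded for repeatability (assumption).
rng = np.random.default_rng(0)
n_rows = 500
df = pd.DataFrame({
    "c1": rng.choice(["A", "B", "C", "D"], size=n_rows),
    "c2": rng.choice(["P", "Q", "R"], size=n_rows),
    "x1": 3 * rng.standard_normal(n_rows) + 90,
    "x2": 2 * rng.standard_normal(n_rows) + 89,
})

levels = sorted(df["c1"].unique())  # one panel per level of c1
groups = sorted(df["c2"].unique())  # one colour per level of c2
ncols = 2                           # illustrative choice
nrows = math.ceil(len(levels) / ncols)

# Roughly xyplot(x1 ~ x2 | c1, groups = c2): x1 on the y axis, x2 on the x axis.
fig_xy, axes_xy = plt.subplots(nrows, ncols, sharex=True, sharey=True, squeeze=False)
for ax, level in zip(axes_xy.ravel(), levels):
    panel = df[df["c1"] == level]
    for group in groups:  # same order in every panel keeps colours consistent
        sub = panel[panel["c2"] == group]
        ax.scatter(sub["x2"], sub["x1"], s=10, label=group)
    ax.set_title(f"c1 = {level}")

# One figure-level ("global") legend built from the first panel's handles.
handles, labels = axes_xy[0, 0].get_legend_handles_labels()
fig_xy.legend(handles, labels, title="c2", loc="center right")

# Roughly histogram(~x1 | c1): shared bin edges keep the panels comparable.
bin_edges = np.histogram_bin_edges(df["x1"], bins=20)
fig_h, axes_h = plt.subplots(nrows, ncols, sharex=True, sharey=True, squeeze=False)
for ax, level in zip(axes_h.ravel(), levels):
    ax.hist(df.loc[df["c1"] == level, "x1"], bins=bin_edges)
    ax.set_title(f"c1 = {level}")
```

Dropping the inner loop over `groups` and scattering `panel` directly gives the ungrouped version, the analogue of `xyplot(x1 ~ x2 | c1, data = df)`. Call `fig_xy.savefig(...)` or `plt.show()` to view the result.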
I am also trying to make "conditioned" contour plots such as those produced here (1.4.4.4)
https://scipy-lectures.github.io/intro/matplotlib/matplotlib.html
I have read through these examples: http://nbviewer.ipython.org/github/fonnesbeck/Bios366/blob/master/notebooks/Section2_4-Matplotlib.ipynb
However, I would like the layout to be generated from the number of levels in the categorical conditioning (or "by") variable(s). That is, I would specify the number of columns, and the number of rows would be computed from the number of levels.
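The layout rule described here comes down to a ceiling division. A minimal sketch, assuming a hypothetical helper named `panel_grid` (not a matplotlib function) that takes the number of levels and the desired number of columns, builds the grid with `plt.subplots`, and hides any cells left over in the last row:

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt


def panel_grid(n_panels, ncols):
    """Hypothetical helper: return a figure and one axes per panel.

    The number of rows is derived from the number of panels and columns.
    """
    nrows = math.ceil(n_panels / ncols)
    fig, axes = plt.subplots(
        nrows, ncols,
        sharex=True, sharey=True,
        squeeze=False,                    # always get a 2-D array of axes
        figsize=(3 * ncols, 3 * nrows),   # illustrative sizing
    )
    flat = axes.ravel()
    for ax in flat[n_panels:]:            # hide unused cells in the last row
        ax.set_visible(False)
    return fig, flat[:n_panels]


# Example: 4 levels in 3 columns gives a 2 x 3 grid with 2 hidden cells.
fig, panels = panel_grid(n_panels=4, ncols=3)
```

With a DataFrame, `n_panels` would be the number of unique values of the conditioning column, for example `df['c1'].nunique()`, and each returned axes would be paired with one level.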
I'd appreciate any good advice or steps in the right direction. I'd prefer not to use rpy2 or Python ggplot; I experimented with both and found them frustrating and limiting.
Thanks! Randall