
I have a dataframe with multi-index columns; the two levels are Sample and Lithology.

 Sample        20EC-P     20EC-8  20EC-10-1  ...     20EC-43    20EC-45    20EC-54
Lithology         Pd     Di-Grd         Gb  ... Hbl Plag Pd     Di-Grd         Gb
Rb          7.401575  39.055118   6.456693  ...    0.629921  56.535433  11.653543
Ba         24.610102  43.067678  10.716841  ...    1.073115  58.520532  56.946630
Th          3.176471  19.647059   3.647059  ...    0.823529  29.647059   5.294118

I am trying to put it into a seaborn lineplot as such.

spider = sns.lineplot(data = data, hue = data.columns.get_level_values("Lithology"),
                      style = data.columns.get_level_values("Sample"),
                      dashes = False, palette = "deep")

The lineplot comes out as: (image of the resulting lineplot)

I have two issues. First, I want to format hues by lithology and style by sample. Outside of the lineplot function, I can successfully access sample and lithology using data.columns.get_level_values, but in the lineplot they don't seem to do anything and I haven't figured out another way to access these values. Also, the lineplot reorganizes the x-axis by alphabetical order. I want to force it to keep the same order as the dataframe, but I don't see any way to do this in the documentation.

1 Answer


To use hue= and style=, seaborn prefers its dataframes in long form. DataFrame.melt() unpivots the columns into rows: each level of the column multi-index (Sample and Lithology) becomes a regular column, and a value column holds the data. The index also needs to be converted to a regular column (with .reset_index()).

Most seaborn functions use order= to set an order on the x-values, but lineplot has no such parameter. Instead, make the x column categorical with a fixed order.

from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# (Sample, Lithology) pairs, matching the column labels in the question
column_tuples = [('20EC-P', 'Pd'), ('20EC-8', 'Di-Grd'), ('20EC-10-1', 'Gb'),
                 ('20EC-43', 'Hbl Plag Pd'), ('20EC-45', 'Di-Grd'), ('20EC-54', 'Gb')]
col_index = pd.MultiIndex.from_tuples(column_tuples, names=["Sample", "Lithology"])
data = pd.DataFrame(np.random.uniform(0, 50, size=(3, len(col_index))), columns=col_index, index=['Rb', 'Ba', 'Th'])

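# melt to long form, keeping the element names from the index as a column named 'index'
data_long = data.melt(ignore_index=False).reset_index()
# make categorical, using the row order of the original dataframe
data_long['index'] = pd.Categorical(data_long['index'], categories=data.index)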
ax = sns.lineplot(data=data_long, x='index', y='value',
                  hue="Lithology", style="Sample", dashes=False, markers=True, palette="deep")
ax.set_xlabel('')

ax.legend(loc='upper left', bbox_to_anchor=(1.01, 1.02))
plt.tight_layout()  # fit legend and labels into the figure
plt.show()

sns.lineplot from multiindexed columns

The long dataframe looks like:

   index      Sample    Lithology      value
0     Rb      20EC-P          Pd    6.135005
1     Ba      20EC-P          Pd    6.924961
2     Th      20EC-P          Pd   44.270570
...
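
As a small sketch of why the categorical conversion keeps the element order, the snippet below compares plain string sorting with sorting a categorical column. The names element_order and long_col are made up for illustration; element_order stands in for data.index from the code above.

```python
import pandas as pd

# Hypothetical element order, standing in for data.index in the answer
element_order = ['Rb', 'Ba', 'Th']

# A long-form column with the elements in arbitrary order
long_col = pd.Series(['Th', 'Rb', 'Ba', 'Rb'])

# Plain strings sort alphabetically
as_plain = sorted(long_col)

# A categorical sorts by the position of each value in `categories`
as_cat = pd.Series(pd.Categorical(long_col, categories=element_order))
sorted_cat = list(as_cat.sort_values())

print(as_plain)    # alphabetical order
print(sorted_cat)  # order given by element_order
```

With the categorical column, Rb comes before Ba and Th, which is the order the lineplot then uses on the x-axis.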

2 Comments

Thank you for your help. I'm running into one more error, however. The line where you applied the melt raises "TypeError: melt() got an unexpected keyword argument 'ignore_index'". The traceback points only to that line and nowhere else. That's the only error: if I run it without ignore_index, it returns the dataframe as it should, but with NaN instead of element names for the index.
You really need to upgrade your pandas version; the ignore_index argument of melt() was added in pandas 1.1.0.
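
If upgrading is not an option, one possible workaround (a sketch, not part of the original answer) is to transpose first, so that Sample and Lithology become ordinary columns, and then melt without ignore_index. The small dataframe here uses made-up values and only three of the samples.

```python
import numpy as np
import pandas as pd

# Small stand-in for the dataframe in the question (values are made up)
column_tuples = [('20EC-P', 'Pd'), ('20EC-8', 'Di-Grd'), ('20EC-10-1', 'Gb')]
col_index = pd.MultiIndex.from_tuples(column_tuples, names=['Sample', 'Lithology'])
data = pd.DataFrame(np.arange(9, dtype=float).reshape(3, 3),
                    columns=col_index, index=['Rb', 'Ba', 'Th'])

# Transpose so the elements become columns, then turn the
# (Sample, Lithology) row index into regular columns
wide = data.T.reset_index()

# Melt the element columns; no ignore_index argument is needed
data_long = wide.melt(id_vars=['Sample', 'Lithology'],
                      var_name='index', value_name='value')

# Keep the element order of the original dataframe
data_long['index'] = pd.Categorical(data_long['index'], categories=data.index)
print(data_long)
```

The rows come out in a different order than with melt(ignore_index=False), but the columns (Sample, Lithology, index, value) are the same, so the lineplot call from the answer should work unchanged.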
