I have a DataFrame df like this:

   name          Int1          Int2          Int3
0  foob -2.534519e-05 -1.156744e-04 -1.410195e-04
1   arz  2.907239e-04  3.502863e-04  6.410102e-04
2  foo2 -2.140769e-04  4.214626e-04  2.073857e-04
3   bar  3.366116e-03 -6.125303e-04  2.753586e-03
4   rnd -5.014413e-05 -6.740579e-06 -5.688471e-05
5   baz  3.334906e-04 -7.846232e-05  2.550283e-04
6  rnd2 -6.111069e-04  2.194443e-03  1.583336e-03
7   tet  3.184057e-04  2.208398e-04  5.392455e-04

The df should be plotted as a scatter plot depicting three data points (Int1, Int2, Int3) per name.

Currently, I am using seaborn's stripplot function, which works fine if I plot each column separately (e.g. x=name, y=Int1) onto the same axis of a figure:

import matplotlib.pyplot as plt
import seaborn

fig, ax = plt.subplots()
seaborn.stripplot(x=df.name, y=df.Int1, ax=ax, color='red')
seaborn.stripplot(x=df.name, y=df.Int2, ax=ax, color='blue')

However, I would like a better way to plot this, mainly to get a proper legend and more room for customization. The solution can also be pandas-based.

  • I take it from your example that you want the name to appear on the horizontal axis, and above each one should be three points representing the three values from the row with that name? Do you want the points in each column to be the same color? If not, do you want a consistent color scheme that uses a single color for Int1, a different color for Int2, and another color for Int3? Commented Jan 2, 2017 at 10:14
  • @DavidZ Fully correct. The color scheme should be consistent within a single column (Int1) and different between columns (Int1 vs Int2 vs Int3). E.g. red, green, and blue for Int1, Int2, and Int3, respectively. Thanks. Commented Jan 2, 2017 at 10:17

1 Answer

Here is my solution. It is actually quite simple:

import pandas as pd
import seaborn as sns
df_melt = pd.melt(df, id_vars=['name'], var_name='intensities', value_name='values')
sns.stripplot(x="name", y="values", data=df_melt, hue='intensities')

This takes the original df and, using pandas' melt function, reshapes it into a long-format DataFrame: the intensities column records which source column (Int1, Int2, or Int3) each value came from, the values column holds the number itself, and each name appears in three rows. The second line uses seaborn's stripplot to plot df_melt, with hue='intensities' coloring the points by source column and adding a legend.
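For anyone who wants to run the melt-and-plot approach above end to end, here is a minimal self-contained sketch. It rebuilds df from the values in the question. The red/green/blue palette follows the colors the asker mentioned in the comments; jitter=False and the Agg backend are my own assumptions, not part of the original answer, so drop or change them as needed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "name": ["foob", "arz", "foo2", "bar", "rnd", "baz", "rnd2", "tet"],
    "Int1": [-2.534519e-05, 2.907239e-04, -2.140769e-04, 3.366116e-03,
             -5.014413e-05, 3.334906e-04, -6.111069e-04, 3.184057e-04],
    "Int2": [-1.156744e-04, 3.502863e-04, 4.214626e-04, -6.125303e-04,
             -6.740579e-06, -7.846232e-05, 2.194443e-03, 2.208398e-04],
    "Int3": [-1.410195e-04, 6.410102e-04, 2.073857e-04, 2.753586e-03,
             -5.688471e-05, 2.550283e-04, 1.583336e-03, 5.392455e-04],
})

# Wide -> long: one row per (name, source column) pair.
df_melt = pd.melt(df, id_vars=["name"], var_name="intensities", value_name="values")

# One fixed color per source column (colors taken from the asker's comment).
palette = {"Int1": "red", "Int2": "green", "Int3": "blue"}

fig, ax = plt.subplots()
sns.stripplot(x="name", y="values", hue="intensities", data=df_melt,
              palette=palette, jitter=False, ax=ax)  # jitter=False is an assumption
# Call fig.savefig(...) or plt.show() here, depending on your environment.
```

Because hue is set, seaborn builds the legend itself, so no manual legend handling is needed. Passing a dict as palette keeps the color of each column fixed regardless of the order in which the columns appear.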
