I have a plotly.express.scatter plot with thousands of points. I'd like to add text labels, but only for outliers (e.g., points far away from a trendline).

How do I do this with plotly?

I'm guessing I need to make a list of points I want labeled and then pass this somehow to plotly (update_layout?). I'm interested in a good way to do this.

Any help appreciated.

1 Answer

You have the right idea: you'll want the coordinates of your outliers, and then you can use Plotly's text annotations to add labels at those points. I am not sure how you want to determine outliers, but the following example uses the tips dataset and flags points whose residual from a linear fit is more than three standard deviations from the mean residual.

import pandas as pd
from sklearn import linear_model
import plotly.express as px

df = px.data.tips()

## use linear model to determine outliers by residual
X = df["total_bill"].values.reshape(-1, 1)
y = df["tip"].values

regr = linear_model.LinearRegression()
regr.fit(X, y)

df["predicted_tip"] = regr.predict(X)
df["residual"] = df["tip"] - df["predicted_tip"]
residual_mean, residual_std = df["residual"].mean(), df["residual"].std()
df["residual_normalized"] = ((df["residual"] - residual_mean) / residual_std).abs()

## determine outliers using whatever method you like
outliers = df.loc[df["residual_normalized"] > 3.0, ["total_bill","tip"]]

## note: trendline="ols" requires the statsmodels package to be installed
fig = px.scatter(df, x="total_bill", y="tip", trendline="ols", trendline_color_override="red")

## add text to outliers using their (x,y) coordinates:
for x,y in outliers.itertuples(index=False):
    fig.add_annotation(
        x=x, y=y,
        text="outlier",
        showarrow=False,
        yshift=10
    )
fig.show()

2 Comments

Thank you so much! This is great and I didn't know about fig.add_annotation. Your answer inspired me to think of another potential solution - making a str column in my data that equals the label value if it's an outlier, or "" otherwise. And then do px.scatter(..., text="that_new_column"). I'll accept your answer and maybe post an additional answer with this other implementation.
@william_grisaitis you're welcome! and your idea would have the advantage of only needing to call px.scatter
