1

I need to render at_risk numbers on a Kaplan-Meier graph.

The end result should be similar to this:

[Image: Kaplan-Meier plot with the number of patients at risk listed below the x-axis]

The part I am having trouble rendering is the number of patients at risk at the bottom of the graph. The values displayed there correspond to the values on the x-axis, so in essence it is like a y-axis rendered parallel to the x-axis.

I have tried to replicate the multiple-axes examples found here (https://plot.ly/python/multiple-axes/) without success. I also tried adding a subplot and hiding everything but its x-axis, but then its values do not align with the graph above.

What is the best approach for this?

1
  • Can you provide the data as well? Commented Jan 18, 2019 at 17:43

2 Answers

2

You can plot Kaplan-Meier survival curves with patients at risk in Plotly by using subplots. The first subplot shows the survival rate; the second is a scatter plot in which only the text is shown, i.e. the markers are hidden.

Both subplots use the same x-axis range, and the numbers of patients at risk are plotted at the respective x-values.
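To make it concrete what the numbers in the second subplot are, here is a minimal sketch that computes them without any library. It assumes every patient enters the study at time zero and counts a patient as at risk at time t if their follow-up time is at least t, which is, as far as I understand it, the convention behind the `at_risk` column of the lifelines event table. The function name `at_risk_counts` and the sample data are made up for illustration.

```python
from bisect import bisect_left


def at_risk_counts(durations, times):
    """For each time t, count the subjects whose follow-up time is >= t.

    Assumption: all subjects enter at time zero (no late entry).
    """
    ordered = sorted(durations)
    n = len(ordered)
    # bisect_left returns how many durations are strictly below t,
    # so the remainder were still under observation just before t.
    return [n - bisect_left(ordered, t) for t in times]


durations = [5, 8, 8, 12, 20, 31]  # made-up follow-up times
ticks = [0, 10, 20, 30]            # x-values where the counts are shown
print(at_risk_counts(durations, ticks))  # prints [6, 3, 2, 1]
```

In the examples that follow, the same numbers are read from `kmf.event_table` instead, which is preferable when lifelines is already doing the fitting.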

More examples are here: https://github.com/Ashafix/Kaplan-Meier_Plotly

Example 1 - Lung cancer in female and male patients

import pandas as pd
import lifelines
import plotly
import numpy as np
plotly.offline.init_notebook_mode()

df = pd.read_csv('http://www-eio.upc.edu/~pau/cms/rdata/csv/survival/lung.csv')

# note: in plotly >= 4 this helper is deprecated in favour of plotly.subplots.make_subplots
fig = plotly.tools.make_subplots(rows=2, cols=1, print_grid=False)
kmfs = []

dict_sex = {1: 'Male', 2: 'Female'}

steps = 5  # number of time points at which the number of patients at risk is shown

x_min = 0 # min value in x-axis, used to make sure that both plots have the same range
x_max = 0 # max value in x-axis

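for sex in df.sex.unique():
    T = df[df.sex == sex]["time"]
    # Assumption: status follows the R survival::lung coding (1 = censored, 2 = dead).
    # lifelines casts event_observed to boolean, so the raw 1/2 values would all
    # count as events; convert explicitly instead.
    E = df[df.sex == sex]["status"] == 2
    kmf = lifelines.KaplanMeierFitter()
    kmf.fit(T, event_observed=E)
    kmfs.append(kmf)
    # track the overall time range so both subplots can share it
    x_max = max(x_max, max(kmf.event_table.index))
    x_min = min(x_min, min(kmf.event_table.index))
    fig.append_trace(plotly.graph_objs.Scatter(x=kmf.survival_function_.index,
                                               y=kmf.survival_function_.values.flatten(),
                                               name=dict_sex[sex]),
                     1, 1)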


for s, sex in enumerate(df.sex.unique()):
    event_table = kmfs[s].event_table
    x = []
    # np.linspace always yields exactly `steps` target times; range() with an
    # integer step could yield one fewer, or fail when x_max < steps - 1
    for i in np.linspace(0, x_max, steps):
        # snap each target time to the nearest observed time
        x.append(event_table.index[np.abs(event_table.index.values - i).argmin()])
    fig.append_trace(plotly.graph_objs.Scatter(x=x,
                                               y=[dict_sex[sex]] * len(x),
                                               text=[event_table.loc[t, 'at_risk'] for t in x],
                                               mode='text',
                                               showlegend=False),
                     2, 1)

# just a dummy line used as a spacer/header
t = [''] * len(x)
t[1] = 'Patients at risk'
fig.append_trace(plotly.graph_objs.Scatter(x=x, 
                                           y=[''] * len(x), 
                                           text=t,
                                           mode='text', 
                                           showlegend=False), 
                 2, 1)


# prettier layout
x_axis_range = [x_min - x_max * 0.05, x_max * 1.05]
fig['layout']['xaxis2']['visible'] = False
fig['layout']['xaxis2']['range'] = x_axis_range
fig['layout']['xaxis']['range'] = x_axis_range
fig['layout']['yaxis']['domain'] = [0.4, 1]
fig['layout']['yaxis2']['domain'] = [0.0, 0.3]
fig['layout']['yaxis2']['showgrid'] = False
fig['layout']['yaxis']['showgrid'] = False

plotly.offline.iplot(fig)

Example 2 - Colon cancer with different treatments

df = pd.read_csv('http://www-eio.upc.edu/~pau/cms/rdata/csv/survival/colon.csv')

# note: in plotly >= 4 this helper is deprecated in favour of plotly.subplots.make_subplots
fig = plotly.tools.make_subplots(rows=2, cols=1, print_grid=False)
kmfs = []

steps = 5  # number of time points at which the number of patients at risk is shown

x_min = 0 # min value in x-axis, used to make sure that both plots have the same range
x_max = 0 # max value in x-axis

for rx in df.rx.unique():
    T = df[df.rx == rx]["time"]
    E = df[df.rx == rx]["status"]
    kmf = lifelines.KaplanMeierFitter()

    kmf.fit(T, event_observed=E)
    kmfs.append(kmf)
    x_max = max(x_max, max(kmf.event_table.index))
    x_min = min(x_min, min(kmf.event_table.index))
    fig.append_trace(plotly.graph_objs.Scatter(x=kmf.survival_function_.index,
                                               y=kmf.survival_function_.values.flatten(),
                                               name=rx), 
                     1, 1)


for s, rx in enumerate(df.rx.unique()):
    event_table = kmfs[s].event_table
    x = []
    # np.linspace always yields exactly `steps` target times; range() with an
    # integer step could yield one fewer, or fail when x_max < steps - 1
    for i in np.linspace(0, x_max, steps):
        # snap each target time to the nearest observed time
        x.append(event_table.index[np.abs(event_table.index.values - i).argmin()])
    fig.append_trace(plotly.graph_objs.Scatter(x=x,
                                               y=[rx] * len(x),
                                               text=[event_table.loc[t, 'at_risk'] for t in x],
                                               mode='text',
                                               showlegend=False),
                     2, 1)

# just a dummy line used as a spacer/header
t = [''] * len(x)
t[1] = 'Patients at risk'
fig.append_trace(plotly.graph_objs.Scatter(x=x, 
                                           y=[''] * len(x), 
                                           text=t,
                                           mode='text', 
                                           showlegend=False), 
                 2, 1)


# prettier layout
x_axis_range = [x_min - x_max * 0.05, x_max * 1.05]
fig['layout']['xaxis2']['visible'] = False
fig['layout']['xaxis2']['range'] = x_axis_range
fig['layout']['xaxis']['range'] = x_axis_range
fig['layout']['yaxis']['domain'] = [0.4, 1]
fig['layout']['yaxis2']['domain'] = [0.0, 0.3]
fig['layout']['yaxis2']['showgrid'] = False
fig['layout']['yaxis']['showgrid'] = False

plotly.offline.iplot(fig)
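
Both examples keep the two x-axes aligned by assigning the same range to each. A possible alternative, sketched below under my own assumptions, is to tie the lower axis to the upper one with the `matches` layout attribute so that the ranges also stay in sync when zooming. The figure is built as a plain dict here to keep the sketch self-contained; `km_figure_spec` and the sample numbers are hypothetical, and the `hv` step line shape is my choice rather than something the examples above use.

```python
def km_figure_spec(curves, risk_rows):
    """Build a Plotly figure description as a plain dict.

    curves:    {label: (times, survival_probabilities)}
    risk_rows: {label: (times, at_risk_counts)}
    """
    data = []
    for label, (x, y) in curves.items():
        # upper subplot: one step line per group
        data.append({"type": "scatter", "x": list(x), "y": list(y),
                     "name": label, "mode": "lines",
                     "line": {"shape": "hv"}})
    for label, (x, counts) in risk_rows.items():
        # lower subplot: text only, one row per group
        data.append({"type": "scatter", "x": list(x),
                     "y": [label] * len(x),
                     "text": [str(c) for c in counts],
                     "mode": "text", "showlegend": False,
                     "xaxis": "x2", "yaxis": "y2"})
    layout = {
        "xaxis": {"anchor": "y"},
        "yaxis": {"domain": [0.4, 1.0], "showgrid": False},
        # matches="x" makes the hidden lower axis follow the upper one
        "xaxis2": {"anchor": "y2", "matches": "x", "visible": False},
        "yaxis2": {"domain": [0.0, 0.3], "showgrid": False},
    }
    return {"data": data, "layout": layout}


# made-up numbers, only to show the expected input shape
spec = km_figure_spec(
    curves={"Male": ([0, 5, 10], [1.0, 0.8, 0.5])},
    risk_rows={"Male": ([0, 5, 10], [10, 8, 5])},
)
```

The resulting dict can be passed to `plotly.graph_objects.Figure(spec)` and displayed as usual. In practice the times and counts would come from `kmf.survival_function_` and `kmf.event_table`, as in the examples above.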


-1

This is built into lifelines as well:

import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter
from lifelines.datasets import load_waltons
from lifelines.plotting import add_at_risk_counts

waltons = load_waltons()  # example dataset shipped with lifelines
ix = waltons['group'] == 'control'

ax = plt.subplot(111)

kmf_control = KaplanMeierFitter()
ax = kmf_control.fit(waltons.loc[ix]['T'], waltons.loc[ix]['E'], label='control').plot(ax=ax)

kmf_exp = KaplanMeierFitter()
ax = kmf_exp.fit(waltons.loc[~ix]['T'], waltons.loc[~ix]['E'], label='exp').plot(ax=ax)

# adds one row of at-risk counts per fitter below the x-axis
add_at_risk_counts(kmf_exp, kmf_control, ax=ax)

https://lifelines.readthedocs.io/en/latest/Examples.html#displaying-multiple-at-risk-counts-below-plots

However, I'm not sure if this works well with plotly.

1 Comment

I believe this will draw the plot with matplotlib, but unfortunately I have to use plotly. We are using lifelines already, though; that's where we get the numbers from.
