
I currently have the following code:

import pandas as pd

df = pd.DataFrame({'x1': [1,7,15], 'x2': [5,10,20]})
df


import plotly.graph_objects as go
fig = go.Figure()
for row in df.iterrows():
    row_data = row[1]
    fig.add_trace(go.Scatter(x=[row_data['x1'], row_data['x2']], y=[0,0], mode='lines',
                            line={'color': 'black'}))
fig.update_layout(showlegend=False)
fig.show()


This produces the required result. However, with 30k traces things get pretty slow, both when rendering and when interacting with the plot (zooming, panning), so I'm trying to figure out a better way to do it. I thought of using shapes, but then I lose some functionality that only traces have (e.g. hover information), and I'm also not sure it would be faster. Is there some other way to draw fragmented (non-overlapping) lines within one trace?
Thanks!

Update:
Based on the accepted answer by @Mangochutney, here is how I was able to produce the same plot using a single trace:

import numpy as np
import plotly.graph_objects as go

x = [1, 5, np.nan, 7, 10, np.nan, 15, 20]
y = [0, 0, np.nan, 0, 0, np.nan, 0, 0]
fig = go.Figure()
# Same styling as the multi-trace version: black line, no legend.
fig.add_trace(go.Scatter(x=x, y=y, mode='lines', line={'color': 'black'}))
fig.update_layout(showlegend=False)
fig.show()
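
For a real data frame, the NaN-separated arrays do not have to be typed by hand. The following is a minimal sketch, assuming the segments live in two columns such as df['x1'] and df['x2'] and that all segments share one y value. The helper names segments_to_single_trace and plot_segments are made up for illustration and are not part of plotly.

```python
import numpy as np


def segments_to_single_trace(x1, x2, y_value=0.0):
    """Interleave segment starts and ends with NaN separators."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    gap = np.full(x1.shape, np.nan)
    # Each row is (start, end, NaN); ravel() flattens row by row, giving
    # [x1[0], x2[0], nan, x1[1], x2[1], nan, ...].
    x = np.column_stack([x1, x2, gap]).ravel()
    # y is NaN exactly where x is NaN and y_value elsewhere.
    y = np.where(np.isnan(x), np.nan, y_value)
    return x, y


def plot_segments(x1, x2):
    """Build a one-trace figure (hypothetical helper)."""
    # Imported lazily so the array helper works without plotly installed.
    import plotly.graph_objects as go

    x, y = segments_to_single_trace(x1, x2)
    fig = go.Figure(go.Scatter(x=x, y=y, mode='lines', line={'color': 'black'}))
    fig.update_layout(showlegend=False)
    return fig


x, y = segments_to_single_trace([1, 7, 15], [5, 10, 20])
```

With the sample data frame from the question, plot_segments(df['x1'], df['x2']).show() should draw the three segments in a single trace. The trailing NaN after the last segment is harmless.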


  • Have you considered drawing the lines in matplotlib and then using the resulting image as a background in plotly? You lose the hover information, but rendering will be very fast. Commented Jun 4, 2022 at 11:20
  • Do the lines overlap? What is the maximum length between x1 and x2? What is the domain (max and min value) of x1 and x2? Commented Jun 4, 2022 at 15:12
  • In my real data, lines do not overlap. Lengths between x1 and x2 range from ~100 to 150,000. The domain is 0-675M. Commented Jun 5, 2022 at 11:42

2 Answers

1
+100

By default, you can introduce gaps in a go.Scatter line plot by inserting np.nan entries where a break is needed. This behavior is controlled by the connectgaps parameter, which defaults to False: see the docs.

E.g.: go.Scatter(x=[0,1,np.nan, 2, 3], y=[0,0,np.nan,0,0], mode='lines') should create two line segments, one from 0 to 1 and another from 2 to 3.
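
Since the question mentions hover information, here is a hedged sketch of how per-segment hover text might be attached to such a single trace through the customdata and hovertemplate arguments of go.Scatter. The helper names segment_customdata and hover_trace are hypothetical, and the choice to show each segment's bounds on hover is an assumption about what is wanted. In 'lines' mode, hover typically triggers at the data points, i.e. at the segment endpoints.

```python
import numpy as np


def segment_customdata(x1, x2):
    """Return a (3n, 2) array carrying each segment's bounds per point."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    bounds = np.column_stack([x1, x2])        # shape (n, 2)
    filler = np.full(bounds.shape, np.nan)    # rows for the NaN separators
    # Shape (n, 3, 2): start point, end point, separator for each segment,
    # flattened to one customdata row per plotted point.
    return np.stack([bounds, bounds, filler], axis=1).reshape(-1, 2)


def hover_trace(x1, x2):
    """Build a NaN-separated trace with hover text (hypothetical helper)."""
    # Imported lazily so the array helper works without plotly installed.
    import plotly.graph_objects as go

    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    x = np.column_stack([x1, x2, np.full(x1.shape, np.nan)]).ravel()
    y = np.where(np.isnan(x), np.nan, 0.0)
    return go.Scatter(
        x=x, y=y, mode='lines',
        customdata=segment_customdata(x1, x2),
        hovertemplate='x1=%{customdata[0]}<br>x2=%{customdata[1]}<extra></extra>',
    )


custom = segment_customdata([1, 7], [5, 10])
```

The trace returned by hover_trace can be passed to go.Figure() or fig.add_trace() like any other trace.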


2 Comments

Thanks! This works great. See an example in my original question.
BTW, this also works with Python's None.
0

First, you need to find the overlapping lines; merging them can reduce the size of the data frame drastically. Let us start by defining a sample data frame like yours:

import numpy as np
import pandas as pd
import plotly.graph_objects as go

x_upperbound = 100_000
data = {'x1': [], 'x2': []}
for i in range(30_000):
    start = np.random.randint(1, x_upperbound-10)
    end = np.random.randint(start, start+4)
    data['x1'].append(start)
    data['x2'].append(end)
    
df = pd.DataFrame(data)

Then, using the following code, we can build a reduced (by one third) but equivalent version of the original data frame introduced above:

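# Difference array: +1 where a segment starts, -1 just past where it ends.
l = np.zeros(x_upperbound+2)
for _, row in df.iterrows():
    l[row['x1']] += 1
    l[row['x2']+1] -= 1
# cumsum[i] is the number of segments covering position i.
cumsum = np.cumsum(l)
new_data = {'x1': [], 'x2': []}
flag = False  # True while we are inside a covered run
for i in range(len(cumsum)):
    if cumsum[i]:
        if flag:
            continue
        new_data['x1'].append(i)  # a covered run starts here
        flag = True
    else:
        if flag:
            new_data['x2'].append(i-1)  # the run ended at the previous position
            flag = False
optimized_df = pd.DataFrame(new_data)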

Now for the plot. Using this code, you get exactly the result you would have gotten by graphing the original data frame:

fig = go.Figure()
for row in optimized_df.iterrows():
    row_data = row[1]
    fig.add_trace(go.Scatter(x=[row_data['x1'], row_data['x2']], y=[0,0], mode='lines',
                            line={'color': 'black'}))
fig.update_layout(showlegend=False)
fig.show()

It takes more time if the distance between each x1 and its respective x2 decreases or if their domain expands further.
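
The counting code above allocates an array as large as the domain, which gets expensive for a domain like the 0-675M mentioned in the question's comments. As an alternative, and not the method used in this answer, a sort-based merge depends only on the number of segments. This is a minimal sketch, assuming x1 <= x2 for every segment; merge_overlapping is a hypothetical helper name. Unlike the counting code, it merges only segments that actually overlap or touch, not segments that are merely adjacent on the integer grid.

```python
import numpy as np


def merge_overlapping(x1, x2):
    """Merge overlapping [x1, x2] segments; assumes x1 <= x2 elementwise."""
    x1 = np.asarray(x1)
    x2 = np.asarray(x2)
    if x1.size == 0:
        return x1.copy(), x2.copy()
    order = np.argsort(x1, kind='stable')
    starts = x1[order]
    ends = x2[order]
    # Largest end seen so far among segments sorted by start.
    running_end = np.maximum.accumulate(ends)
    # A new merged segment begins where a start lies beyond every earlier end.
    new_group = np.empty(len(starts), dtype=bool)
    new_group[0] = True
    new_group[1:] = starts[1:] > running_end[:-1]
    # Index of the last original segment in each merged group.
    group_last = np.append(np.flatnonzero(new_group)[1:] - 1, len(starts) - 1)
    return starts[new_group], running_end[group_last]


merged_x1, merged_x2 = merge_overlapping([1, 2, 10], [5, 6, 12])
```

With the sample data frame above, merge_overlapping(df['x1'], df['x2']) returns two arrays that can be wrapped in a new data frame and plotted the same way.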

2 Comments

Thanks. Looks like a clever way to reduce the data, but overlaps are actually not a problem in my data, so I don't expect much effect here. I'll indicate this in the question to avoid further confusion.
Do you mean that your traces do not overlap? Would you please also answer my question in the comment section of your question?
