2

I have a dataframe with x1 and x2 columns. I want to plot each row as a one-dimensional line segment, where x1 is the start and x2 is the end. Below is my solution, which is not very elegant. It is also slow when plotting 900 lines in the same plot.

Create some example data:

import numpy as np
import pandas as pd    
df_lines = pd.DataFrame({'x1': np.linspace(1,50,50)*2, 'x2': np.linspace(1,50,50)*2+1})

My solution:

import matplotlib.pyplot as plt
def plot(dataframe):
    plt.figure()
    for item in dataframe.iterrows():
        x1 = int(item[1]['x1'])
        x2 = int(item[1]['x2'])
        plt.hlines(0,x1,x2)

plot(df_lines)

It actually works but I think it could be improved. Thanks in advance.

3 Answers

4

You can use DataFrame.apply with axis=1 to process the dataframe row by row:

def plot(dataframe):
    plt.figure()
    dataframe.apply(lambda x: plt.hlines(0,x['x1'],x['x2']), axis=1)

plot(df_lines)
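
A possible variation, offered as a sketch rather than a tested benchmark: hlines also accepts array-like arguments, so the row-wise apply can be collapsed into a single call that returns one LineCollection. The function name plot_vectorized is made up for this illustration, and y is passed as an array of the same length as the x columns because that is the form I am sure works.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df_lines = pd.DataFrame({'x1': np.linspace(1, 50, 50) * 2,
                         'x2': np.linspace(1, 50, 50) * 2 + 1})

def plot_vectorized(dataframe):
    # Hypothetical helper: draws every row with one hlines call.
    plt.figure()
    y = np.zeros(len(dataframe))  # one y value per segment, all on y = 0
    # hlines returns a single LineCollection holding all segments
    return plt.hlines(y, dataframe['x1'].to_numpy(), dataframe['x2'].to_numpy())

lines = plot_vectorized(df_lines)
```

Call plt.show() (or save the figure) afterwards to see the result. Unlike the original loop, this version does not truncate x1 and x2 to integers.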

3

Matplotlib can save a lot of time drawing lines when they are organized in a LineCollection. Instead of drawing 50 individual hlines, as the other answers do, you create a single object.

A LineCollection requires an array of the line vertices as input, with shape (number of lines, points per line, 2). In this case that is (50, 2, 2).

import numpy as np
import pandas as pd    
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

df_lines = pd.DataFrame({'x1': np.linspace(1,50,50)*2, 
                         'x2': np.linspace(1,50,50)*2+1})

# segs[i] = [[x1, y], [x2, y]]; the y coordinates stay at 0 for every vertex
segs = np.zeros((len(df_lines), 2, 2))
segs[:, :, 0] = df_lines[["x1", "x2"]].values

fig, ax = plt.subplots()

line_segments = LineCollection(segs)
ax.add_collection(line_segments)

ax.set_xlim(0,102)
ax.set_ylim(-1,1)
plt.show()
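
A minimal sketch of the same idea wrapped in a reusable function, assuming you would rather not hard-code the x limits. The name plot_segments and its y parameter are illustrative only and not part of matplotlib. Since add_collection does not rescale the view by itself, the sketch calls ax.autoscale() and then sets the y limits by hand, because all segments share one y value.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

def plot_segments(dataframe, y=0.0):
    # Hypothetical helper: builds the (n, 2, 2) vertex array and draws it.
    starts = dataframe["x1"].to_numpy()
    ends = dataframe["x2"].to_numpy()
    ys = np.full(len(dataframe), y)
    # For each row: [[x1, y], [x2, y]]
    segs = np.stack([np.column_stack([starts, ys]),
                     np.column_stack([ends, ys])], axis=1)

    fig, ax = plt.subplots()
    ax.add_collection(LineCollection(segs))
    ax.autoscale()             # fit the view to the data added above
    ax.set_ylim(y - 1, y + 1)  # all segments share one y, so pad manually
    return fig, ax, segs

df_lines = pd.DataFrame({"x1": np.linspace(1, 50, 50) * 2,
                         "x2": np.linspace(1, 50, 50) * 2 + 1})
fig, ax, segs = plot_segments(df_lines)
```

Follow this with plt.show() or fig.savefig(...) to view the plot.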


1 Comment

Best optimization so far. When tested with 500 lines, you start to see a large performance gain. Nice one :)
2

To add to @jezrael's nice answer: the same can be done within NumPy using numpy.apply_along_axis. Performance-wise it is equivalent to DataFrame.apply:

def plot(dataframe):
    plt.figure()
    np.apply_along_axis(lambda x: plt.hlines(0,x[0],x[1]), 1,dataframe.values)
    plt.show()

plot(df_lines)
