Slow matplotlib Plotting

Question

I have a MultiIndexed pandas Series and am trying to plot each outer index level in its own subplot, but it is running very slowly.

To accomplish the subplotting I am using a for loop over the outer level of the MultiIndex, and plotting the Series using the inner index level as the x coordinate.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def plot_series( data ):
    # create 16 subplots, corresponding to the 16 outer index levels
    fig, axs = plt.subplots( 4, 4 )

    for oi in data.index.get_level_values( 'outer_index' ):
        # calculate subplot to use
        row = int( oi/ 4 )
        col = int( oi - row* 4 )

        ax = axs[ row, col ]
        data.xs( oi ).plot( use_index = True, ax = ax )

    plt.show()

Each outer index level has 1000 data points, but the plotting takes several minutes to complete.

Is there a way to speed up the plotting?

Data

num_out = 16
num_in  = 1000

data = pd.Series( 
    data = np.random.rand( num_out* num_in ), 
    index = pd.MultiIndex.from_product( [ np.arange( num_out ), np.arange( num_in ) ], names = [ 'outer_index', 'inner_index' ] ) 
)

Hey, could you add an example of your data so that we could actually run the code? There also seems to be an error, with a variable not being defined: index =
– Mitchell van Zuylen, Commented Mar 7, 2019 at 11:01
Sorry about the errors and the missing data. I was trying to post this before a meeting and didn't make all the corrections in my rush. Thanks for the help :)
– bicarlsen, Commented Mar 7, 2019 at 13:42

dubbbdan · Accepted Answer · 2019-03-07 17:00:59Z

2

Rather than loop through data.index.get_level_values( 'outer_index' ), you could use data.groupby(level='outer_index') and iterate through the grouped object using:

for name, group in grouped:
   #do stuff

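This iterates over each outer index value exactly once, whereas data.index.get_level_values( 'outer_index' ) returns one label per row, so the original loop slices the Series with data.xs( oi ) and plots 16,000 times rather than 16.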
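The iteration counts of the two loops can be checked directly. This is a minimal sketch that rebuilds the question's data; the variable names n_labels, n_unique and n_groups are only illustrative.

```python
import numpy as np
import pandas as pd

num_out = 16
num_in = 1000

data = pd.Series(
    data=np.random.rand(num_out * num_in),
    index=pd.MultiIndex.from_product(
        [np.arange(num_out), np.arange(num_in)],
        names=["outer_index", "inner_index"],
    ),
)

# One label per row: this is what the question's loop iterates over.
labels = data.index.get_level_values("outer_index")
n_labels = len(labels)

# Distinct outer labels: one per subplot.
n_unique = len(labels.unique())

# groupby yields each outer label once, together with its rows.
n_groups = sum(1 for _ in data.groupby(level="outer_index"))

print(n_labels, n_unique, n_groups)  # 16000 16 16
```

Looping over labels.unique() instead of labels would presumably also bring the original function down to 16 iterations, although that variant was not timed here.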

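import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def plot_series(data):
    # each group is yielded once, keyed by its outer index value
    grouped = data.groupby(level='outer_index')

    # create 16 subplots, one per outer index value
    fig, axs = plt.subplots( 4, 4 )
    for name, group in grouped:
        # calculate subplot to use
        row, col = divmod( name, 4 )
        ax = axs[ row, col ]
        group.plot( use_index = True, ax = ax )

    # show the figure once, after all subplots are drawn
    plt.show()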



num_out = 16
num_in  = 1000

data = pd.Series( 
    data = np.random.rand( num_out* num_in ), 
    index = pd.MultiIndex.from_product( [ np.arange( num_out ), np.arange( num_in ) ], names = [ 'outer_index', 'inner_index' ] ) 
)

plot_series(data)

Using timeit, you can see this approach is much faster:

%timeit plot_series(data)
795 ms ± 252 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
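A possible variation on the groupby approach, offered as a sketch rather than as the answer's method: pairing axs.flat with the groups avoids computing the row and column by hand, and dropping the outer level with droplevel keeps the inner index as the x coordinate. The function name plot_series_flat, the subplot titles and the Agg backend line are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_series_flat(data):
    """Plot each outer index value in its own subplot (hypothetical helper)."""
    grouped = data.groupby(level="outer_index")
    fig, axs = plt.subplots(4, 4)

    # axs.flat walks the 4x4 grid row by row, one Axes per group.
    for ax, (name, group) in zip(axs.flat, grouped):
        # Drop the outer level so the x axis is the inner index.
        group.droplevel("outer_index").plot(ax=ax)
        ax.set_title(str(name))

    return fig, axs


num_out = 16
num_in = 1000

data = pd.Series(
    data=np.random.rand(num_out * num_in),
    index=pd.MultiIndex.from_product(
        [np.arange(num_out), np.arange(num_in)],
        names=["outer_index", "inner_index"],
    ),
)

fig, axs = plot_series_flat(data)
```

Returning the figure instead of calling plt.show() inside the function leaves it to the caller to display or save the result.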

edited Mar 7, 2019 at 17:00

answered Mar 7, 2019 at 16:45


2 Comments

bicarlsen Over a year ago

That worked beautifully. I'm surprised it was the slicing that was the bottleneck. What causes it to be so slow?

dubbbdan Over a year ago

I think it is because groupby splits the data frame into groups using a mapper.
