I am having a tough time drawing a Plotly 3D surface plot. I have a large DataFrame with 4000 rows and three columns. I asked questions here before and got some answers, but when I try them the code runs for hours and I still see no plot. I am new to surface plots, so I want to confirm that what I am doing is right.

My code:

import plotly.graph_objects as go
import plotly.graph_objs
import plotly
df = 
index     x           y           z
0        10.2        40.5        70.5            
1        30.5        30.2       570.5
.
.
4000     100.5       201.5      470.5

df['z']= [df['z'].tolist for x in df.index]
df = 
index     x           y           z
0        10.2        40.5       [70.5,570.5,..,470.5]            
1        30.5        30.2       [70.5,570.5,..,470.5]
.
.
4000     100.5       201.5      [70.5,570.5,..,470.5]

zdata = [df['z'].tolist()]*len(df)
plotly.offline.plot({"data": [go.Surface(x=df['x'].values,
                                         y=df['y'].values,
                                         z=df['z'].values)],
                     "layout": plotly.graph_objs.Layout(title='Some data', autosize=False,
                                                        width=600, height=600,
                                                        scene=dict(xaxis_title='x',
                                                                   yaxis_title='y',
                                                                   zaxis_title='z'),
                                                        margin=dict(l=10, r=10, b=10, t=10))})

I would be grateful if somebody could confirm whether my approach to generating the surface plot is correct.

2 Answers

Here is a simple, stripped-down example of a 3D surface plot to hopefully get you going.

The key message here is: don't overcomplicate it. The same logic should be fine on a DataFrame with 4000+ rows. (Granted, with 4000 x values and 4000 y values the grid holds 4000 × 4000 = ~16M data points, so it'll take a bit of time to plot.)

The key point to remember is that z must be a 2D array of shape [y.shape[0], x.shape[0]]: one row per y value and one column per x value. For example, if x and y both have length 10, then z must have shape [10, 10].
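
To make the shape requirement concrete, here is a minimal sketch with axes of different lengths. It assumes Plotly's convention that rows of z run along y and columns along x; the names nx and ny are made up for illustration.

```python
import numpy as np

nx, ny = 3, 4              # hypothetical axis lengths, chosen to differ
x = np.arange(nx)          # shape (3,)
y = np.arange(ny)          # shape (4,)

# With default indexing, meshgrid returns arrays of shape (len(y), len(x))
xx, yy = np.meshgrid(x, y)
z = xx**2 + yy             # one z value per (y, x) grid cell

print(z.shape)             # (4, 3)
```

Using different lengths for x and y makes a transposed z show up as a shape mismatch rather than as a silently rotated surface.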

As I don't have your complete dataset, I've synthesised the data; I hope that's OK for illustration purposes. I've also stuck with numpy for simplicity, keeping in mind that a DataFrame column converts directly to a numpy array (for example with to_numpy()).

Simple example:

import numpy as np
from plotly.offline import plot

n = 10
x = np.arange(n)
y = x
z = np.tile(x**2, [n, 1])

data = [{'x': x,
         'y': y,
         'z': z,
         'type': 'surface'}]

plot({'data': data}, filename='/path/to/graph.html')

Something a little more fun:

n = 360
x = np.arange(n)
y = x
v = np.tile([np.sin(i*(np.pi/180)) for i in range(n)], [n, 1]).T
z = (v.T[0]*v)

data = [{'x': x,
         'y': y,
         'z': z,
         'type': 'surface'}]

plot({'data': data}, filename='/path/to/graph.html')

You'll note the plotting logic is identical.

7 Comments

How do I convert the data in my z column into a 2D array of the required shape? Is this the same as what I have already done, or something different? Can you explain from this perspective?
If all arrays are the same length, a simple way is: z = np.tile(x, [x.shape[0], 1]). This 'tiles' (replicates) the x array into a shape of [10, 10] if x has length 10. This is the method I used in the example. Typically, though, a 3D surface plot has len(x) * len(y) z data points. If z has that many values, use z.reshape([y.shape[0], x.shape[0]]) to reshape the z array using the lengths of x and y.
It is still confusing to me. Is the way I modified column z in my question correct or wrong? I mean, I modified it to match the size of x and y, right?
Note that using to_numpy() rather than tolist() should be faster for you. (Just ran on 1000 records and was ~6x faster).
As simple as: df['z'] = [df['z'].to_numpy() for _ in df.index]
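
The reshape route mentioned in the comments above can be sketched as follows. It assumes the flat z values are ordered row by row (all x values for the first y, then all x values for the next y); that ordering is an assumption about the data, not something the question confirms. The arrays are made up for illustration.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])      # 3 unique x values (hypothetical)
y = np.array([10.0, 20.0])         # 2 unique y values (hypothetical)

# Flat z with len(x) * len(y) entries, assumed ordered y-major
z_flat = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Rows follow y, columns follow x
z = z_flat.reshape(y.shape[0], x.shape[0])

print(z.shape)                     # (2, 3)
```

If the data is ordered x-major instead, reshaping to (len(x), len(y)) and then transposing should give the same layout.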

When you use go.Surface, z should be a two-dimensional matrix, and x and y should hold the unique values along the x and y axes. The code below converts the DataFrame columns into that form.

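import numpy as np
import plotly.graph_objects as go

x_data = df['x'].values
y_data = df['y'].values
z_data = df['z'].values

# Axis values: the sorted unique coordinates
x = np.unique(x_data)
y = np.unique(y_data)

# Fill the z matrix with NaN so grid cells without data stay empty
z = np.full((x.shape[0], y.shape[0]), np.nan)
for i in range(len(z_data)):
    # Find this point's position among the unique x and y values.
    # np.where takes either one argument or three, so pass only the mask.
    xi = np.where(np.isclose(x, x_data[i]))[0][0]
    yi = np.where(np.isclose(y, y_data[i]))[0][0]
    z[xi, yi] = z_data[i]
# go.Surface expects z with shape (len(y), len(x))
z = z.transpose()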

fig = go.Figure(data=[go.Surface(z=z, x=x, y=y)])
fig.show()
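
For long-format data like the three-column frame in the question, a pandas pivot is a possible alternative to the explicit loop. This is only a sketch on a small made-up DataFrame (the values are hypothetical). It assumes each (x, y) pair appears at most once; with duplicates, pivot_table with an aggregation function would be needed instead.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data: one row per (x, y, z) point
df = pd.DataFrame({
    'x': [1.0, 1.0, 2.0, 2.0, 3.0],
    'y': [10.0, 20.0, 10.0, 20.0, 10.0],
    'z': [5.0, 6.0, 7.0, 8.0, 9.0],
})

# Rows are indexed by y and columns by x; missing pairs become NaN
grid = df.pivot(index='y', columns='x', values='z')

x = grid.columns.to_numpy()   # unique x values, sorted
y = grid.index.to_numpy()     # unique y values, sorted
z = grid.to_numpy()           # shape (len(y), len(x))

print(z.shape)                # (2, 3)
```

The resulting x, y and z can then be passed to go.Surface(z=z, x=x, y=y) as in the code above. Note that if almost every x and y value in the data is unique, the grid will be mostly NaN and the surface will look empty; interpolating onto a regular grid first may be a better fit in that case.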

3 Comments

in z_data[I], it seems like a typo - should be lowercase 'i'
Anyway, it still doesn't seem to work: it fails in the for loop, where x_data[i] - x_min is not an integer.
I've updated the example to address the issue that @NoamG was having, since I was having that problem as well. The revised solution uses np.where(x, np.isclose(x, x_data[i])) to find indices within the arrays returned by np.unique. This should work for both floats and ints
