
I am trying to plot t-SNE-reduced vectors with Seaborn. I have the following code:

import pandas as pd 
import numpy as np
import seaborn as sns
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)
tsne_results = tsne.fit_transform(final_data)

df_subset = pd.DataFrame(columns = ['tsne-2d-one', 'tsne-2d-two']) 
df_subset['tsne-2d-one'] = tsne_results[:,0]
df_subset['tsne-2d-two'] = tsne_results[:,1]

plt.figure(figsize=(16,10))

sns.scatterplot(
    x="tsne-2d-one", y="tsne-2d-two",
    hue="y",
    palette=sns.color_palette("hls", 10),
    data=df_subset,
    legend="full")

As the code above shows, seaborn's scatterplot seems to require a pandas.DataFrame as input, so I initialize an empty one like this:

df_subset = pd.DataFrame(columns = ['tsne-2d-one', 'tsne-2d-two']) 

Then I assign each t-SNE dimension to a column of this DataFrame:

df_subset['tsne-2d-one'] = tsne_results[:,0]
df_subset['tsne-2d-two'] = tsne_results[:,1]

I can print these values without any problem.

However, when I run the code, here is what I get:

File "balance-training.py", line 59, in <module>
    legend="full")
  File "/home/server/.local/lib/python3.6/site-packages/seaborn/relational.py", line 1335, in scatterplot
    alpha=alpha, x_jitter=x_jitter, y_jitter=y_jitter, legend=legend,
  File "/home/server/.local/lib/python3.6/site-packages/seaborn/relational.py", line 852, in __init__
    x, y, hue, size, style, units, data
  File "/home/server/.local/lib/python3.6/site-packages/seaborn/relational.py", line 142, in establish_variables
    raise ValueError(err)
ValueError: Could not interpret input 'y'

What am I missing here?

1 Answer


df_subset has no column named y, so seaborn cannot interpret hue="y". You can remove that argument:

sns.scatterplot(
    x="tsne-2d-one", y="tsne-2d-two",
    palette=sns.color_palette("hls", 10),
    data=df_subset,
    legend="full")
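If the intent was to colour the points by class, an alternative to dropping hue is to give the DataFrame the column that hue refers to. The following is only a minimal sketch: the small array stands in for the real t-SNE output, and `labels` is a hypothetical array of class labels (one per row) that does not appear in the question.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in for the real t-SNE output: 6 points in 2 dimensions.
tsne_results = np.array([[1, 2], [4, 5], [7, 1], [2, 3], [5, 4], [6, 2]])
# Hypothetical class labels, one per row of tsne_results (an assumption).
labels = np.array([0, 1, 2, 0, 1, 2])

# Build the frame directly from the array instead of starting from an empty one.
df_subset = pd.DataFrame(tsne_results, columns=["tsne-2d-one", "tsne-2d-two"])
df_subset["y"] = labels  # this is the column that hue="y" looks up

# Use one palette colour per distinct label.
n_classes = df_subset["y"].nunique()
ax = sns.scatterplot(
    x="tsne-2d-one", y="tsne-2d-two",
    hue="y",
    palette=sns.color_palette("hls", n_classes),
    data=df_subset,
    legend="full")
```

Sizing the palette from the number of distinct labels avoids a mismatch between the number of colours and the number of hue levels.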

I think it is also possible to pass both vectors directly to the x and y parameters and omit the data parameter:

sns.scatterplot(
    x=tsne_results[:,0], y=tsne_results[:,1],
    palette=sns.color_palette("hls", 10),
    legend="full")

Sample:

tsne_results = np.array([[1,2],[4,5],[7,1]])
print (tsne_results)
[[1 2]
 [4 5]
 [7 1]]

sns.scatterplot(
    x=tsne_results[:,0], y=tsne_results[:,1],
    palette=sns.color_palette("hls", 10),
    legend="full")
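The same array-based call can still colour points by class if hue is passed as a vector too. This is a sketch under the assumption that a `labels` array (hypothetical, not part of the question) holds one label per point:

```python
import numpy as np
import seaborn as sns

tsne_results = np.array([[1, 2], [4, 5], [7, 1]])
# Hypothetical labels, one per row of tsne_results (an assumption).
labels = np.array(["a", "b", "a"])

# x, y and hue are all plain arrays, so no DataFrame and no data= is needed.
ax = sns.scatterplot(
    x=tsne_results[:, 0], y=tsne_results[:, 1],
    hue=labels,
    palette=sns.color_palette("hls", len(np.unique(labels))),
    legend="full")
```

Because hue is an array here rather than a column name, there is no string for seaborn to look up in a DataFrame.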



1 Comment

Rookie mistake :-( Thank you
