I'm trying to plot the original data, before handling the imbalance, in a way that shows the class distribution and the class imbalance (the class is failure = 0/1). I might need to transform the data in both cases to be able to visualize it.
Here's what the column looks like:
| failure |
|---------|
| 1 |
| 0 |
| 0 |
| 1 |
| 0 |
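For anyone who wants to reproduce this, the five sample rows above can be put in a NumPy array and counted per class. This is only a sketch: the variable names are mine, and the array holds just the sample rows, not my full column.

```python
import numpy as np

# The five sample rows from the table above (illustrative only).
failure = np.array([1, 0, 0, 1, 0])

# np.bincount counts occurrences of each non-negative integer,
# so index 0 holds the number of 0s and index 1 the number of 1s.
counts = np.bincount(failure, minlength=2)
proportions = counts / counts.sum()

print(counts)       # [3 2]
print(proportions)  # [0.6 0.4]
```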
Here's what I have tried so far:
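
    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.stats import gaussian_kde

    def distribution_scatter(x, symmetric=True, cmap=None, size=None):
        # Estimate the density of x with a Gaussian KDE
        pdf = gaussian_kde(x)
        # Random jitter in [0, 1), or [-1, 1) when symmetric
        w = np.random.rand(len(x))
        if symmetric:
            w = w * 2 - 1
        # Scale the jitter by the density so points spread out where x is dense
        pseudo_y = pdf(x) * w
        if cmap:
            plt.scatter(x, pseudo_y, c=x, cmap=cmap, s=size)
        else:
            plt.scatter(x, pseudo_y, s=size)
        return pseudo_y
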
Results:
The problem with the results:
I want to plot the distribution of 0s and 1s, and I believe I need to transform the data in some way to do that.
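One transformation that might work (an assumption on my part, not something I have confirmed is the right approach) is to reduce the column to per-class counts and draw them as bars. Here is a rough sketch of that idea, assuming the labels are the integers 0 and 1. The function name `plot_class_counts` is hypothetical, and the sample array stands in for my real column.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt
import numpy as np

def plot_class_counts(labels):
    """Draw one bar per class and return the per-class counts."""
    # Index i of counts holds how many times class i appears.
    counts = np.bincount(np.asarray(labels), minlength=2)
    class_names = [str(c) for c in range(len(counts))]

    fig, ax = plt.subplots()
    ax.bar(class_names, counts)
    ax.set_xlabel("failure")
    ax.set_ylabel("number of rows")
    ax.set_title("Class distribution before handling the imbalance")
    return counts

# Sample rows only; replace with the real failure column.
sample_counts = plot_class_counts([1, 0, 0, 1, 0])
```

With a strongly imbalanced column, the height difference between the two bars would show the imbalance directly, with no KDE involved.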




