Using the Titanic dataset from Kaggle, I'm trying to create a boxplot of the age column with matplotlib.pyplot. I used the following code:

plt.boxplot(dataframe.age)
plt.show()

But all I get is a blank graph.

Is there anything I'm doing wrong? Please let me know.

A lot of people online use the seaborn package for this. I accept that seaborn is the easier method, but I'm currently learning matplotlib.pyplot specifically (I'm still a newbie), so I need a solution that uses it.

Thanks for understanding and for your help in advance!

  • My guess is it's an issue with the dataset. Have you looked at the dataframe to make sure you're using the proper column name? A quick look at the website shows the name may be "Age", not "age". Commented Sep 18, 2021 at 18:03
  • @ramzikai The dataset I got has all the headers in lowercase, so that wasn't the problem. The link was just a reference. Commented Sep 18, 2021 at 18:08
  • seaborn is just a high-level API for matplotlib Commented Sep 18, 2021 at 19:34

1 Answer

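There are a lot of missing values in the Age column of the Titanic data set, and matplotlib cannot compute the quartiles for the box when the data contains NaN, so nothing is drawn. Either remove those rows or fill the missing values with a default before creating the boxplot.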

plt.boxplot(data['Age'].fillna(0.0))

(Image: Titanic Age boxplot)
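
For the other option mentioned above, removing the rows with missing ages instead of filling them, here is a minimal sketch. It assumes a lowercase age column as in the question, and it uses a small made-up DataFrame in place of the real Kaggle file, so the values are purely illustrative. The non-interactive backend and the output filename are also just assumptions so that it runs anywhere.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show() instead
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Titanic data (illustrative values only).
dataframe = pd.DataFrame(
    {"age": [22.0, 38.0, np.nan, 35.0, np.nan, 54.0, 2.0, 27.0]}
)

# Count the missing ages first; a non-zero count explains the blank plot.
n_missing = int(dataframe["age"].isna().sum())
print(f"missing ages: {n_missing}")

# Drop the NaN entries so the quartiles can be computed.
ages = dataframe["age"].dropna()

fig, ax = plt.subplots()
result = ax.boxplot(ages)
ax.set_ylabel("age")
fig.savefig("age_boxplot.png")  # hypothetical filename; use plt.show() interactively
```

Dropping the missing rows keeps the box based only on ages that were actually recorded, whereas filling with 0.0 pulls the median and quartiles toward zero.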

1 Comment

Ohh ok I get it. I remember trying to fill the NaN values to no avail, but your answer has helped me. Thanks :)