So I set up an empty dataframe DF and load data into it according to some conditions. As a result, some of its elements stay empty (NaN). I noticed that if I don't specify the dtype as float when I create the empty dataframe, DF.boxplot() gives me an 'Index out of range' error.
As I understand it, pandas' DF.boxplot() uses matplotlib's plt.boxplot() function, so naturally I tried plt.boxplot(DF.iloc[:,0]) to plot the boxplot of the first column. I noticed the reverse behavior: when the dtype of DF is float, plt.boxplot() does not work and just shows me an empty plot. In the code below, DF.boxplot() won't work, but plt.boxplot(DF.iloc[:,0]) will plot a boxplot. If I add dtype='float' when first creating the dataframe, it is the other way around: DF.boxplot() works and plt.boxplot(DF.iloc[:,0]) gives me an empty plot.
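import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# No dtype given, so the columns are created with dtype 'object'
DF = pd.DataFrame(index=range(10), columns=range(4))
for i in range(10):
    for j in range(4):
        if i == j:
            continue  # leave the diagonal entries empty (NaN)
        DF.iloc[i, j] = i

# DF.boxplot()                  # fails with the 'Index out of range' error
# plt.boxplot(DF.iloc[:, 0])    # plots a boxplot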
I am wondering whether this has to do with how plt.boxplot() handles NaN for different data types. If so, why doesn't DF.boxplot() work when the dataframe's dtype is 'object', if pandas is just using matplotlib's boxplot function?
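To narrow this down, here is a minimal sketch of the checks I would run without plotting anything. It rests on my own guess that two things matter: whether pandas sees the columns as numeric, and how NaN propagates through a percentile calculation. I have not confirmed that either is what the two boxplot functions do internally. The names build, df_obj and df_float are made up for this example.

```python
import numpy as np
import pandas as pd

def build(dtype=None):
    # Hypothetical helper: same fill pattern as above, diagonal left as NaN.
    df = pd.DataFrame(index=range(10), columns=range(4), dtype=dtype)
    for i in range(10):
        for j in range(4):
            if i != j:
                df.iloc[i, j] = i
    return df

df_obj = build()           # no dtype given -> object columns
df_float = build("float")  # explicit float columns

print(df_obj.dtypes.unique())
print(df_float.dtypes.unique())

# How many columns count as numeric in each frame
n_numeric_obj = df_obj.select_dtypes(include="number").shape[1]
n_numeric_float = df_float.select_dtypes(include="number").shape[1]
print(n_numeric_obj, n_numeric_float)

# Median of the first float column, with and without the NaN entry
col = df_float.iloc[:, 0]
median_with_nan = np.percentile(col.to_numpy(), 50)
median_without_nan = np.percentile(col.dropna().to_numpy(), 50)
print(median_with_nan, median_without_nan)
```

If the object frame reports no numeric columns, and the percentile over the float column comes back as NaN until the missing entry is dropped, that would at least be consistent with the two behaviors I am seeing. It would also suggest trying plt.boxplot(DF.iloc[:,0].dropna()) on the float frame.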
