I'm trying to plot data read into Pandas from an xlsx file. After some minor formatting and data quality checks, I try to plot using matplotlib but get the following error:
TypeError: Empty 'DataFrame': no numeric data to plot
This is not a new issue and I have followed many of the pages on this site dealing with this very problem. The posted suggestions, unfortunately, have not worked for me.
My data set includes strings (the locations of sampling sites, limited to the first column), dates (which I have converted to the correct format using pd.to_datetime), many NaN entries (which cannot be converted to zeros because of the graphical analysis we are doing), and column headings representing various analytical parameters.
As per some of the suggestions I read on this site, I have tried the following code:
df = df.astype(float)
which gives me the following error:
ValueError: could not convert string to float: 'Site 1'
(Site 1 is a sampling location.)
df = df.apply(pd.to_numeric, errors='ignore')
which gives me the following:
dtypes: float64(13), int64(1), object(65)
and therefore does not appear to work, as most of the data remains as object. The date entries are the int64, and I cannot figure out why some of the data columns are float64 and some remain as objects.
df = df.apply(pd.to_numeric, errors='coerce')
which deletes the entire DataFrame, possibly because this operation fills the entire DataFrame with NaN?
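A minimal sketch of one way to narrow this down, assuming a small made-up frame in place of the real spreadsheet (the column names "Site", "Date", "pH" and "Nitrate" and all values are hypothetical). It first lists the cells that block numeric conversion in each measurement column, then coerces only those columns, so the site names are left alone and unparseable cells become NaN rather than zero:

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: one text column, one date
# column, and measurement columns that were read in as strings.
df = pd.DataFrame({
    "Site": ["Site 1", "Site 2", "Site 3"],
    "Date": ["2015-01-05", "2015-02-05", "2015-03-05"],
    "pH": ["7.1", ".", "6.8"],        # "." is a stray non-numeric entry
    "Nitrate": ["0.5", "0.7", None],  # genuinely missing value
})
df["Date"] = pd.to_datetime(df["Date"])

# Only the measurement columns should be converted to numbers.
value_cols = [c for c in df.columns if c not in ("Site", "Date")]

# Collect the non-missing cells that cannot be parsed as numbers; these
# are what keep a column at object dtype.
offending = {}
for col in value_cols:
    converted = pd.to_numeric(df[col], errors="coerce")
    bad = df.loc[converted.isna() & df[col].notna(), col]
    if not bad.empty:
        offending[col] = sorted(bad.unique())
print(offending)

# Coerce just the measurement columns; unparseable cells become NaN.
df[value_cols] = df[value_cols].apply(pd.to_numeric, errors="coerce")

# Index by date so a plot would use time on the x-axis.
numeric = df.set_index("Date")[value_cols]
print(numeric.dtypes)
```

With every measurement column at a float dtype, calling numeric.plot() should no longer have an empty set of numeric columns to work with. Applying the coercion to the whole frame instead would also turn the site names into NaN, which may be why the result looked empty.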
I'm stuck and would appreciate any insight.
EDIT
I was able to solve my own question based on some of the feedback. Here is what worked for me:
import pandas as pd

path = "path"  # path to the xlsx file
header = [0]  # keep column headings as first row of original data
skip = [1]  # skip second row, which has units of measure
na_val = ['.', '-.', '-+0.01']  # Convert spurious decimal points that have
                                # no number associated with them to NaN
convert = {col: float for col in (4,...,80)}  # Convert specific columns to
                                              # float from original text
parse_col = "A,C,E:CC"  # apply to specific columns (Excel letters/ranges)
# usecols is the current pandas keyword for selecting Excel columns
df = pd.read_excel(path, header=header, skiprows=skip,
                   na_values=na_val, converters=convert, usecols=parse_col)
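To illustrate what the header, skiprows and na_values arguments do, here is a small sketch that swaps in pd.read_csv on an in-memory string for pd.read_excel, since no workbook is available here; both readers accept these three keywords. The sample rows and column names are invented for the illustration:

```python
import io
import pandas as pd

# Hypothetical data laid out like the sheet described above: a header
# row, a units row, then measurements with stray "." and "-." entries.
raw = io.StringIO(
    "Site,Date,pH,Nitrate\n"
    ",,pH units,mg/L\n"
    "Site 1,2015-01-05,7.1,0.5\n"
    "Site 2,2015-02-05,.,0.7\n"
    "Site 3,2015-03-05,6.8,-.\n"
)

data = pd.read_csv(
    raw,
    header=0,               # first row holds the column headings
    skiprows=[1],           # drop the units row (second line of the file)
    na_values=[".", "-."],  # treat stray decimal points as missing
    parse_dates=["Date"],   # read the date column as datetimes
)
print(data.dtypes)
```

Because the units row is skipped and the stray markers are declared as missing at read time, the measurement columns should come in as floats directly, without a separate conversion step.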