I am not really a programmer, but I need to work out a relationship for an equation of two variables. I have been googling extensively, but I can't figure out how to feed my data into sklearn's linear_model.
I have a dataframe defined as follows:
import pandas as pd
I = [-2, 0, 5, 10, 15, 20, 25, 30]
d = {27.11 : [9.01,8.555,7.56,6.77,6.14,5.63,5.17,4.74],
28.91 : [8.89,8.43,7.46,6.69,6.07,5.56,5.12,4.68],
30.72 : [8.76,8.32,7.36,6.60,6.00,5.50,5.06,4.69],
32.52 : [8.64,8.20,7.26,6.52,5.93,5.44,5.00,4.58],
34.33 : [8.52,8.08,7.16,6.44,5.86,5.38,4.95,4.52],
36.11 : [8.39,7.97,7.07,6.35,5.79,5.31,4.86,4.46]}
oxy = pd.DataFrame(index = I, data = d) # temp, salinity to oxygenation ml/L
The index represents temperature and the column names represent salinity. I need a way to predict oxygenation (the values in the columns) from temperature and salinity.
I think my issue is mostly syntax related.
I have tried fitting my data with
from sklearn import linear_model
X = [list(oxy.columns.values),list(oxy.index.values)]
regr = linear_model.LinearRegression()
regr.fit(X,oxy)
along with lots of variants that try to associate the value at each (index, column) position in the dataframe with the corresponding entry of X. I just can't figure out how to do this.
I found lots of guides on regression with two variables, but they all used flat datasets, and I don't know how to flatten this one without lots and lots of typing.
So my question is either: is there a way to do a regression on two variables when my independent variables are the index and column values of a pandas dataframe? Or: is there a quick and efficient way to flatten this dataframe into a 48 by 3 dataframe, so that one of the many guides I've found will actually help me?
Thank you in advance.
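To make the second option concrete, here is a minimal sketch of what the flattening and the fit might look like. It assumes that `DataFrame.stack()` followed by `reset_index()` is an acceptable way to get the 48 by 3 layout, and the column names `temperature`, `salinity`, and `oxygen` are made up for illustration. It is one possible approach under those assumptions, not necessarily the best one, and a plain linear fit may not capture the curvature in this data.

```python
import pandas as pd
from sklearn import linear_model

I = [-2, 0, 5, 10, 15, 20, 25, 30]
d = {27.11: [9.01, 8.555, 7.56, 6.77, 6.14, 5.63, 5.17, 4.74],
     28.91: [8.89, 8.43, 7.46, 6.69, 6.07, 5.56, 5.12, 4.68],
     30.72: [8.76, 8.32, 7.36, 6.60, 6.00, 5.50, 5.06, 4.69],
     32.52: [8.64, 8.20, 7.26, 6.52, 5.93, 5.44, 5.00, 4.58],
     34.33: [8.52, 8.08, 7.16, 6.44, 5.86, 5.38, 4.95, 4.52],
     36.11: [8.39, 7.97, 7.07, 6.35, 5.79, 5.31, 4.86, 4.46]}
oxy = pd.DataFrame(index=I, data=d)  # temp, salinity to oxygenation ml/L

# stack() moves the column labels (salinity) into a second index level,
# so the result has one entry per (temperature, salinity) pair.
flat = oxy.stack().reset_index()

# Column names below are assumptions chosen for readability.
flat.columns = ["temperature", "salinity", "oxygen"]

# Two independent variables in X, one dependent variable in y.
X = flat[["temperature", "salinity"]]
y = flat["oxygen"]

regr = linear_model.LinearRegression()
regr.fit(X, y)

print(flat.shape)                  # rows x columns of the flattened table
print(regr.coef_, regr.intercept_)

# Predict oxygenation for a hypothetical temperature/salinity pair.
query = pd.DataFrame({"temperature": [12.0], "salinity": [31.0]})
print(regr.predict(query))
```

The key point is that `fit` expects X with one row per observation and one column per feature, which is why the wide table has to be reshaped before fitting.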