import os
from PIL import Image as PImage
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from scipy.stats import chisquare


# Read in csv file
# File: https://github.com/mGalarnyk/Python_Tutorials/blob/master/Python_Basics/Linear_Regression/linear.csv
raw_data = pd.read_csv(r"C:\Users\Aidan\Desktop\NEW TASK\Amos_2001_4p2_APD_CONC_Fig2C_OC.csv")

# Removes rows with NaN in them
filtered_data = raw_data[~np.isnan(raw_data["y"])] 


x_y = np.array(filtered_data)

x, y, y_err = x_y[:,0], x_y[:,1], x_y[:,2]


# Reshaping
x, y = x.reshape(-1,1), y.reshape(-1, 1)

# Linear Regression Object 
lin_regression = LinearRegression()

# Fitting linear model to the data
lin_regression.fit(x,y)

# Get slope of fitted line
m = lin_regression.coef_

# Get y-Intercept of the Line
b = lin_regression.intercept_

# Get Predictions for original x values
# you can also get predictions for new data
predictions = lin_regression.predict(x)
chi= chisquare(predictions, y)

# following slope intercept form 
print ("formula: y = {0}x + {1}".format(m, b)) 
print(chi)

# Plot the Original Model (Black) and Predictions (Blue)
plt.scatter(x, y,  color='black')
plt.plot(x, predictions, color='blue',linewidth=3)
plt.errorbar(x, y, yerr=y_err, fmt='o', capsize=4, color='black')
plt.show()

Imported csv data:

1.01214,0.3609367,-0.01214
1.992202,0.341559,0.007798
2.995016,0.3510846,0.004984
3.974359,0.3405953,0.025641
4.976273,0.3612314,0.023727
5.954718,0.3618527,0.045282
6.984058,0.3536173,0.015942
7.962502,0.3542386,0.037498
8.967653,0.3348767,0.032347
9.969748,0.3532908,0.030252

Error:

runfile('C:/Users/Aidan/.spyder-py3/temp.py', wdir='C:/Users/Aidan/.spyder-py3')
Traceback (most recent call last):

  File "", line 1, in <module>
    runfile('C:/Users/Aidan/.spyder-py3/temp.py', wdir='C:/Users/Aidan/.spyder-py3')

  File "C:\Users\Aidan\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "C:\Users\Aidan\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/Aidan/.spyder-py3/temp.py", line 15, in <module>
    filtered_data = raw_data[~np.isnan(raw_data["y"])]

  File "C:\Users\Aidan\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2685, in __getitem__
    return self._getitem_column(key)

  File "C:\Users\Aidan\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2692, in _getitem_column
    return self._get_item_cache(key)

  File "C:\Users\Aidan\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2486, in _get_item_cache
    values = self._data.get(item)

  File "C:\Users\Aidan\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4115, in get
    loc = self.items.get_loc(item)

  File "C:\Users\Aidan\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3065, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'y'

Without the third column in the CSV, this script runs perfectly. I want to use the third column of data as error bars. How can I add y error bars to my script?

  • Using plt.errorbar()? Commented Aug 13, 2018 at 14:20

1 Answer


Just save your error bars in a variable as follows:

x, y, y_err = x_y[:,0], x_y[:,1], x_y[:,2]

and use plt.errorbar as

plt.errorbar(x, y, yerr=y_err, fmt='o', capsize=4, color='black')

This draws the data points with their error bars. You can customize the error bars with further arguments described on this page: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.errorbar.html
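For reference, here is a minimal self-contained sketch of the two lines above, using only the first three rows of the data posted in the question. One thing to watch: the first row has a negative value in the third column, and recent Matplotlib releases reject negative `yerr` values with a `ValueError`. Taking the absolute value (the `y_err_abs` name is introduced here for illustration) is an assumption about what that column means, so check it against your data source.

```python
import numpy as np
import matplotlib.pyplot as plt

# First three rows of the data posted in the question: x, y, y error.
x_y = np.array([
    [1.01214, 0.3609367, -0.01214],
    [1.992202, 0.341559, 0.007798],
    [2.995016, 0.3510846, 0.004984],
])

# Slice the three columns into separate 1-D arrays.
x, y, y_err = x_y[:, 0], x_y[:, 1], x_y[:, 2]

# Assumption: the sign of the error column carries no meaning,
# so only its magnitude is used for the bar length.
y_err_abs = np.abs(y_err)

fig, ax = plt.subplots()
# Draw the points with symmetric vertical error bars.
container = ax.errorbar(x, y, yerr=y_err_abs, fmt='o', capsize=4, color='black')
ax.set_xlabel("x")
ax.set_ylabel("y")
```

Call `plt.show()` afterwards to display the figure when running interactively, for example in Spyder.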


5 Comments

I made those two edits in my script, but I am getting the following error; see the edited script and error in my updated post above.
The problem seems to be in your file. To plot the figure I posted as an answer, I just copied the data you posted into a CSV file and used the lines above.
Doesn't my script look like the one you used? Or was yours much simpler?
I created a CSV file with the data you provided and did not use the line filtered_data = raw_data[~np.isnan(raw_data["y"])]. I kept everything else exactly as you had it, except that I used filtered_data = raw_data, where raw_data is the data from the CSV file. This means something is wrong in the line where isnan is called, and hence something in your file. To check, make a copy of your file and keep only the first 10 data points, without any NaN.
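A KeyError: 'y' from raw_data["y"] means the DataFrame has no column labelled "y". One possible cause, offered as an assumption rather than a diagnosis: if the file on disk looks exactly like the data pasted in the question, it has no header row, so pandas takes the first data row as the column names. The sketch below reads the pasted rows with explicit column names and then drops rows with a missing y. The helper name `load_xy_err` and the labels "x", "y", "y_err" are chosen here for illustration; they are not in the original file.

```python
import io
import pandas as pd

# The rows posted in the question, inlined so the sketch is self-contained.
CSV_TEXT = """1.01214,0.3609367,-0.01214
1.992202,0.341559,0.007798
2.995016,0.3510846,0.004984
3.974359,0.3405953,0.025641
4.976273,0.3612314,0.023727
5.954718,0.3618527,0.045282
6.984058,0.3536173,0.015942
7.962502,0.3542386,0.037498
8.967653,0.3348767,0.032347
9.969748,0.3532908,0.030252
"""


def load_xy_err(source):
    """Read a header-less three-column CSV and drop rows with a missing y."""
    # header=None stops pandas from using the first data row as column names.
    # The names below are assumed labels, not something stored in the file.
    data = pd.read_csv(source, header=None, names=["x", "y", "y_err"])
    # Same intent as raw_data[~np.isnan(raw_data["y"])] in the question.
    return data.dropna(subset=["y"])


filtered_data = load_xy_err(io.StringIO(CSV_TEXT))
print(filtered_data.columns.tolist())
print(len(filtered_data))
```

With a real file, pass its path to `load_xy_err` instead of the `io.StringIO` object. If the file does have a header row, printing `raw_data.columns` shows the actual labels to use in place of "y".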
