Second Derivative in Python - scipy/numpy/pandas

Question

I'm trying to take a second derivative in python with two numpy arrays of data.

For example, the arrays in question look like this:

import numpy as np

x = np.array([ 120. ,  121.5,  122. ,  122.5,  123. ,  123.5,  124. ,  124.5,
        125. ,  125.5,  126. ,  126.5,  127. ,  127.5,  128. ,  128.5,
        129. ,  129.5,  130. ,  130.5,  131. ,  131.5,  132. ,  132.5,
        133. ,  133.5,  134. ,  134.5,  135. ,  135.5,  136. ,  136.5,
        137. ,  137.5,  138. ,  138.5,  139. ,  139.5,  140. ,  140.5,
        141. ,  141.5,  142. ,  142.5,  143. ,  143.5,  144. ,  144.5,
        145. ,  145.5,  146. ,  146.5,  147. ])

y = np.array([  1.25750000e+01,   1.10750000e+01,   1.05750000e+01,
         1.00750000e+01,   9.57500000e+00,   9.07500000e+00,
         8.57500000e+00,   8.07500000e+00,   7.57500000e+00,
         7.07500000e+00,   6.57500000e+00,   6.07500000e+00,
         5.57500000e+00,   5.07500000e+00,   4.57500000e+00,
         4.07500000e+00,   3.57500000e+00,   3.07500000e+00,
         2.60500000e+00,   2.14500000e+00,   1.71000000e+00,
         1.30500000e+00,   9.55000000e-01,   6.65000000e-01,
         4.35000000e-01,   2.70000000e-01,   1.55000000e-01,
         9.00000000e-02,   5.00000000e-02,   2.50000000e-02,
         1.50000000e-02,   1.00000000e-02,   1.00000000e-02,
         1.00000000e-02,   1.00000000e-02,   1.00000000e-02,
         1.00000000e-02,   1.00000000e-02,   5.00000000e-03,
         5.00000000e-03,   5.00000000e-03,   5.00000000e-03,
         5.00000000e-03,   5.00000000e-03,   5.00000000e-03,
         5.00000000e-03,   5.00000000e-03,   5.00000000e-03,
         5.00000000e-03,   5.00000000e-03,   5.00000000e-03,
         5.00000000e-03,   5.00000000e-03])

I have y = f(x) sampled at these points, and I want d^2 y / dx^2.

Numerically, I know I can either interpolate the function and differentiate the interpolant analytically, or use higher-order finite differences. I think there is enough data for either approach, so I'm open to whichever is considered faster or more accurate.

I have looked at np.interp() and scipy.interpolate with no success: they return a fitted (linear or cubic) spline, but I don't know how to get the derivative at a given point.

Any guidance is much appreciated.

Did you have a look at np.diff?

– mkhanoyan, Commented Oct 24, 2016 at 20:16
My concern is that my data points are not evenly spaced.

– Jared, Commented Oct 24, 2016 at 20:18

Community · Accepted Answer · 2020-06-20 09:12:55Z

28

You can interpolate your data using scipy's 1-D spline classes. The fitted spline has a convenient derivative method for computing derivatives.

For the data of your example, using UnivariateSpline gives the following fit

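import matplotlib.pyplot as plt
from scipy.interpolate import UnivariateSpline

# x and y are the arrays from the question
# s=0 makes the spline pass through every sample; k=4 is the spline degree
y_spl = UnivariateSpline(x, y, s=0, k=4)

plt.semilogy(x, y, 'ro', label='data')
x_range = np.linspace(x[0], x[-1], 1000)
plt.semilogy(x_range, y_spl(x_range), label='spline fit')
plt.legend()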

The fit seems reasonably good, at least visually. You might want to experiment with the parameters used by UnivariateSpline.

The second derivative of the spline fit can be obtained simply as

y_spl_2d = y_spl.derivative(n=2)

plt.plot(x_range,y_spl_2d(x_range))

The outcome appears somewhat unnatural (in case your data corresponds to some physical process). You may want to change the spline fit parameters, improve your data (e.g., provide more samples, perform less noisy measurements), or decide on an analytic function to model your data and perform a curve fit (e.g., using scipy's curve_fit).
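One way to judge whether the spline parameters are reasonable is to run the same recipe on a function whose second derivative is known. The sketch below is an illustration rather than part of the original answer: the sine test function, the uneven grid, and the comparison interval are all assumptions chosen for the example.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Assumed test case: sin(x) on an unevenly spaced grid (not the question's data)
x_test = 2.0 * np.pi * np.linspace(0.0, 1.0, 60) ** 1.5
y_test = np.sin(x_test)

# Same settings as above: s=0 interpolates every sample, k=4 is the spline degree
spl = UnivariateSpline(x_test, y_test, s=0, k=4)
spl_2d = spl.derivative(n=2)

# Compare with the exact second derivative, -sin(x), away from the end points
x_fine = np.linspace(0.5, 5.5, 500)
max_err = np.max(np.abs(spl_2d(x_fine) + np.sin(x_fine)))
print(f"max interior error of the second derivative: {max_err:.2e}")
```

With noisy samples, a positive smoothing factor s trades fidelity to the data for a smoother derivative, whereas s=0 forces the spline through every point, noise included.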

edited Jun 20, 2020 at 9:12

CommunityBot

answered Oct 24, 2016 at 20:47

Stelios

3 Comments

Jared Over a year ago

This data is supposed to represent a probability density function. What are the best methods to normalize this curve and impose constraints such as no negative values?

Stelios Over a year ago

I don't think there is a standard answer to that, since the approach of using a generic interpolation method has limited options in imposing constraints. In principle, you will need to formulate and solve a constrained optimization problem from scratch. You might want to start by normalizing your data since y_spl.integral(x[0],x[-1]) is about 80, which, of course, is not a valid value for a pdf.
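A crude version of the normalization step might look like the sketch below. It is only an assumption-laden illustration, not a constrained fit: the Gaussian-shaped samples are hypothetical stand-ins for the real data, and clipping followed by rescaling is a simple post-processing choice rather than a method proposed in this thread.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical non-negative samples on the question's uneven grid
x = np.concatenate(([120.0], np.arange(121.5, 147.5, 0.5)))
y = 80.0 * np.exp(-0.5 * ((x - 130.0) / 3.0) ** 2)

# Step 1: scale the samples so the interpolating spline integrates to 1
spl = UnivariateSpline(x, y, s=0, k=4)
area = spl.integral(x[0], x[-1])
spl_norm = UnivariateSpline(x, y / area, s=0, k=4)

# Step 2: evaluate on a fine grid, clip any negative values, then
# rescale so the trapezoid-rule area of the clipped curve is again 1
grid = np.linspace(x[0], x[-1], 2001)
pdf = np.clip(spl_norm(grid), 0.0, None)
pdf /= np.sum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(grid))
```

Clipping changes the shape of the curve wherever the spline dips below zero, so this is at best a starting point before formulating a properly constrained problem.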

Galen Over a year ago

How does this answer differ from using ratios of np.diff twice? Better or worse numerically?

manu190466 · Accepted Answer · 2016-10-24 20:51:23Z

14

By finite differences, the first-order derivative of y, evaluated at the midpoint of each pair of consecutive x values, is given by:

dy=np.diff(y,1)
dx=np.diff(x,1)
yfirst=dy/dx

The corresponding values of x are:

xfirst=0.5*(x[:-1]+x[1:])

For the second-order derivative, repeat the same process:

dyfirst=np.diff(yfirst,1)
dxfirst=np.diff(xfirst,1)
ysecond=dyfirst/dxfirst

xsecond=0.5*(xfirst[:-1]+xfirst[1:])
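The two passes above can be wrapped in a small helper and checked on a function with a known answer. This is a sketch: the helper name second_derivative_diff and the quadratic test data are made up for illustration. It also shows np.gradient, which accepts the coordinate array directly and returns an array of the same length as the input, as a possible alternative for unevenly spaced samples.

```python
import numpy as np

def second_derivative_diff(x, y):
    """Apply np.diff twice; each pass lives on the midpoints of the previous grid."""
    x_mid = 0.5 * (x[:-1] + x[1:])           # where the first derivative is estimated
    d1 = np.diff(y) / np.diff(x)
    x_mid2 = 0.5 * (x_mid[:-1] + x_mid[1:])  # where the second derivative is estimated
    d2 = np.diff(d1) / np.diff(x_mid)
    return x_mid2, d2

# Hypothetical test data: uneven grid, y = x**2, exact second derivative is 2
x = np.array([0.0, 1.5, 2.0, 2.5, 3.5, 4.0, 6.0])
y = x ** 2

xs, d2 = second_derivative_diff(x, y)

# Alternative: nested np.gradient calls with the coordinates passed explicitly
d2_grad = np.gradient(np.gradient(y, x, edge_order=2), x, edge_order=2)
```

Note that the np.diff version loses two samples, and on an uneven grid its x locations differ slightly from the interior samples x[1:-1], whereas np.gradient reports values at the original x.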

answered Oct 24, 2016 at 20:51

manu190466

1 Comment

Blaze Over a year ago

np.diff(np.diff([x*x for x in range(0,10)])) = [2,2,2..]
