0

I have numpy.array data set from a simulation, but I'm missing the point at the edge (x=0.1), how can I interpolate/extrapolate the data in z to the edge? I have:

x = [ 0.  0.00667  0.02692  0.05385  0.08077]
y = [ 0.     10.     20.     30.     40.     50.]

#       0.      0.00667 0.02692 0.05385 0.08077 
z = [[ 25.     25.     25.     25.     25.   ]   # 0.
     [ 25.301  25.368  25.617  26.089  26.787]   # 10.
     [ 25.955  26.094  26.601  27.531  28.861]   # 20.
     [ 26.915  27.126  27.887  29.241  31.113]   # 30.
     [ 28.106  28.386  29.378  31.097  33.402]   # 40.
     [ 29.443  29.784  30.973  32.982  35.603]]  # 50.

I want to add a new column in z corresponding to x = 0.1 so that my new x will be

x_new = [ 0.  0.00667  0.02692  0.05385  0.08077  0.1]

#       0.      0.00667 0.02692 0.05385 0.08077 0.01
z = [[ 25.     25.     25.     25.     25.       ?   ]   # 0.
     [ 25.301  25.368  25.617  26.089  26.787    ?   ]   # 10.
     [ 25.955  26.094  26.601  27.531  28.861    ?   ]   # 20.
     [ 26.915  27.126  27.887  29.241  31.113    ?   ]   # 30.
     [ 28.106  28.386  29.378  31.097  33.402    ?   ]   # 40.
     [ 29.443  29.784  30.973  32.982  35.603    ?   ]]  # 50.

Where all '?' replaced with interpolated/extrapolated data. Thanks for any help!

1
  • The extrapolated value would depend on whether the relation of the progressive elements in the sequence are linear or logarithmic. If the former, you could just multiply the penultimate column by your constant, if the latter, then you'd have to look at the progression between columns and determine a suitable value. Also, this is definitely extrapolation. Commented Sep 14, 2016 at 14:53

1 Answer 1

2

Have you had a look at scipy.interpolate2d.interp2d (which uses splines)?

from scipy.interpolate import interp2d
fspline = interp2d(x,y,z) # maybe need to switch x and y around
znew = fspline([0.1], y)
z = np.c_[[z, znew] # to join arrays

EDIT:

The method that @dnalow and I are imagining is along the following lines:

import numpy as np
import matplotlib.pyplot as plt

# make some test data
def func(x, y):
    return np.sin(np.pi*x) + np.sin(np.pi*y)

xx, yy = np.mgrid[0:2:20j, 0:2:20j]
zz = func(xx[:], yy[:]).reshape(xx.shape)

fig, (ax1, ax2, ax3, ax4) = plt.subplots(1,4, figsize=(13, 3))
ax1.imshow(zz, interpolation='nearest')
ax1.set_title('Original')

# remove last column
zz[:,-1] = np.nan
ax2.imshow(zz, interpolation='nearest')
ax2.set_title('Missing data')

# compute missing column using simplest imaginable model: first order Taylor
gxx, gyy = np.gradient(zz[:, :-1])
zz[:, -1] =  zz[:, -2] + gxx[:, -1] + gyy[:,-1]
ax3.imshow(zz, interpolation='nearest')
ax3.set_title('1st order Taylor approx')

# add curvature to estimate
ggxx, _ = np.gradient(gxx)
_, ggyy = np.gradient(gyy)
zz[:, -1] = zz[:, -2] + gxx[:, -1] + gyy[:,-1] + ggxx[:,-1] + ggyy[:, -1]
ax4.imshow(zz, interpolation='nearest')
ax4.set_title('2nd order Taylor approx')

fig.tight_layout()
fig.savefig('extrapolate_2d.png')

plt.show()

enter image description here

You could improve the estimate by
(a) adding higher order derivatives (aka Taylor expansion), or
(b) computing the gradients in more directions than just x and y (and then weighting the gradients accordingly).

Also, you will get better gradients if you pre-smooth the image (and now we have a complete Sobel filter...).

Sign up to request clarification or add additional context in comments.

5 Comments

I tried your suggestion but what happens is that it just takes the last value in each row and duplicates it into another column, probably because I try to go outside the original data and interp2d only works in between values?
Since this is extrapolation, interpolation cannot help. Certainly not interp2d. Furthermore, extrapolation only makes sense if a model for the data exists on whose basis one can extrapolate. If not, it is not clear how to obtain the best values. Therefore, this is more than just a numerical problem.
Yeah, should have remembered that interp2d evaluates to nearest value at the edges (or nan). I think @dnalow is right: you need to compute some local gradients, maybe even the curvature and make an explicit model.
In principle, if nothing is known, the nearest value is actually the most reliable one. Therefore I think the answer deserves +1. Assume some 'nice' behavior of the function, one can use a polynomial approximation to continue the data similar to a Taylor expansion: stackoverflow.com/questions/33964913/… stackoverflow.com/questions/7997152/…
Thank you Paul for an excellent explanation! That is exactly what i need(, except only in one dimension for this particular case). And thank you dnalow for your interesting comments, I'll keep that in mind! Fortunately the data represents a physical phenomena with 'smooth' gradients, so I believe Paul's solution works for me in this case.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.