I am trying to create a heightmap by interpolating between a bunch of heights at certain points in an area. To process the whole image, I have the following code snippet:
map_ = np.zeros((img_width, img_height))
for x in range(img_width):
for y in range(img_height):
map_[x, y] = calculate_height(set(points.items()), x, y)
This is calculate_height:
def distance(x1, y1, x2, y2) -> float:
return np.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
def calculate_height(points: set, x, y) -> float:
total = 0
dists = {}
for pos, h in points:
d = distance(pos[0], pos[1], x, y)
if x == pos[0] and y == pos[1]:
return h
d = 1 / (d ** 2)
dists[pos] = d
total += d
r = 0
for pos, h in points:
ratio = dists[pos] / total
r += ratio * h
r = r
return r
This snippet works perfectly, but if the image is too big, it takes a long time to process, because this is O(n^2). The problem with this, is that a "too big" image is 800 600, and it takes almost a minute to process, which to me seems a bit excessive.
My goal is not to reduce the time complexity from O(n^2), but to reduce the time it takes to process images, so that I can get decently sized images in a reasonable amount of time.
I found
this post, but I couldn't really try it out because it's all for a 1D array, and I have a 2D array, and I also need to pass the coordinates of each point and the set of existing points to the calculate_height function. What can I try to optimize this code snippet?
Edit: Moving set(points.items) out of the loop as @thethiny suggested was a HUGE improvement. I had no idea it was such a heavy thing to do. This makes it fast enough for me, but feel free to add more suggestions for the next people to come by!
Edit 2: I have further optimized this processing by including the following changes:
# The first for loop inside the calculate_distance function
for pos, h in points:
d2 = distance2(pos[0], pos[1], x, y)
if x == pos[0] and y == pos[1]:
return h
d2 = d2 ** -1 # 1 / (d ** 2) == d ** -2 == d2 ** -1
dists[pos] = d2 # Having the square of d on these two lines makes no difference
total += d2
This reduced execution time for a 200x200 image from 1.57 seconds to 0.76 seconds. The 800x600 image mentioned earlier now takes 6.13 seconds to process :D
This is what points looks like (as requested by @norok12):
# Hints added for easier understanding, I know this doesn't run
points: dict[tuple[int, int], float] = {
(x: int, y: int): height: float,
(x: int, y: int): height: float,
(x: int, y: int): height: float
}
# The amount of points varies between datasets, so I can't provide a useful range other than [3, inf)
calculate_height(set(points.items()), x, y)does. Could you elaborate on that? You may be able to vectorize this whole thing and get your processin times down by a factor of 30 or more.set(points.items())outside of the loop since that's always the same code. Second you should take this to the Code Review website where the people there are more prone to answering optimization help.calculate_heighton different points since they don't clash with each other. Learn aboutnumbaand it'll save you lots of time.calculate_height. It's just 2D interpolation