0

I wrote the following program that extracts pairs of points that are a given distance and given elevation difference from each other from a list of XYZ points (columns 2 3 and 4 respective). The problem is it contains a nested for loop which I believe causes too many iterations and thus for large amounts of points (>1000) the routine takes an unreasonable time to complete. What can I do to optimise this algorithm?

Regards,

Amine

Sub Test1()

Columns("H:K").Select
Selection.ClearContents
c3 = 2
For c2 = 2 To 10

For c1 = c2 + 1 To 10
    If Sqr((Cells(c2, 2) - Cells(c1, 2)) ^ 2 + (Cells(c2, 3) - Cells(c1, 3)) ^ 2) = 1 Then
        If Abs(Cells(c1, 4) - Cells(c2, 4)) = 10 Then


                Cells(c3, 8) = Cells(c1, 2)
                Cells(c3, 9) = Cells(c1, 3)
                Cells(c3, 10) = Cells(c2, 2)
                Cells(c3, 11) = Cells(c2, 3)
                c3 = c3 + 1
        End If
    End If
Next c1

Next c2

End Sub
2
  • 1
    It will run much faster if you read the data into a variant array and do your calculatins without referencing the worksheet. Its like, block read, calculate, block write back to the output range. It may be worth the overhead of loading the variant array into a 2D array of Double (or Long if they are integers) first. Also c1, c2, and c3 should be declared explicitly as Long. They are currently variants which are slow. Commented Jul 23, 2014 at 15:13
  • yes performance code should always used typed variables instead of checking the type every time accessing it Commented Jul 23, 2014 at 15:48

2 Answers 2

1

You don't have lot of options to optimize your algorithm given you need to evaluate the distance and altitude between every point.

  1. You can do as Vikram Bhat said and sort your data in a 3d-tree but that mean taking computation time to build the tree and if you only use the tree once i'm not sure you will gain a lot of time.

  2. You can evaluate the distance faster by removing the Sqr().
    ((x2-x1)²+(y2-y1)²) = (distance)²
    Since you are looking for a fixed distance, it's faster to compute (distance)² one time then use the value in each if.
    In your case (with distance = 1 then distance² = 1) your test would become :

    ((Cells(c2, 2) - Cells(c1, 2)) ^ 2 + (Cells(c2, 3) - Cells(c1, 3)) ^ 2) = 1
    

    You can also use a distance approximation algorithm :
    http://en.wikibooks.org/wiki/Algorithms/Distance_approximations

  3. Another optimization would be swapping the two if conditions checking the elevation before the distance. Because this condition is faster to compute and may avoid having to compute distance it could be a nice speedup for your algorithm.

Your code modified :

Sub Test1()

Columns("H:K").Select
Selection.ClearContents
c3 = 2
For c2 = 2 To 10

For c1 = c2 + 1 To 10
    If ((Cells(c2, 2) - Cells(c1, 2)) ^ 2 + (Cells(c2, 3) - Cells(c1, 3)) ^ 2) = 1 Then
        If Abs(Cells(c1, 4) - Cells(c2, 4)) = 10 Then
                Cells(c3, 8) = Cells(c1, 2)
                Cells(c3, 9) = Cells(c1, 3)
                Cells(c3, 10) = Cells(c2, 2)
                Cells(c3, 11) = Cells(c2, 3)
                c3 = c3 + 1
        End If
    End If
Next c1

Next c2

End Sub
Sign up to request clarification or add additional context in comments.

Comments

0

You should consider using 3d-tree which help get nearest neighbour in O(logn+k) where k are valid neighbours , you can stop finding neighbour if they exceed the distance . This would work in O(n(logn + k)) instead of O(n^2) that your brute force algorithm gives.

7 Comments

Dear Vikram, thank you for your response. If you don't mind, I have 3 questions : 1) what do the variables O, n and k represent in the equations you have provided? 2) is the kd tree a programmable task using standard VB syntax or can it only be implemented through external means and 3)Could another optimisation involve referencing an array rather than cells or ranges?
@user32882 i dont know of any kd tree libraries in VB but that wont stop u as there are lot of such libraries in python , ruby , c++ from which you can spawn a process from VB and then recollect the output through a output file. I advice u to use already present libraries.
@user32882 The O , n ,k variable are not of importance to u if you dont know them as they are used in time complexity analysis which i think you unfamiliar with . I just mentioned them to tell you that how much faster the solution using kd tree can be so dont worry if you are not familiar with them.
@user32882 if cells are in memory then i dont think there is any improvement if you are using arrays so continue with cells.
@vikrambhat well, ok, but whatever the algorithm, it will run about ten times faster if you avoid the Cell references. As written, it will be horrendously slow. My second point would be that "if the cells are in memory" is a somewhat blur statement. Of course the cells are in memory, the issue is the process and the overhead for accessing them. But anyway the advise is there since there is a VBA tag.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.