Execution time of Clojure program

Question

This is a clojure program to read integers from a file and count the number of inversions i.e the number of times a larger number appears before a smaller number in a sequence. It implements a O(n^2) algorithm, and takes about 30 minutes to run on an input of size 100,000.

(defn count-inversions
  [file]
  (let [int-list (vec (map #(Integer/parseInt %)
                           (clojure.string/split (slurp file) #"\r\n")))]
    (loop [c 0
           t-list int-list]
      (let [size (count t-list)]
        (if (empty? t-list)
         c
         (recur (reduce #(if (< %2 (first t-list)) (inc %1) %1) c (subvec t-list 1 (dec  size)))
                (subvec t-list 1 (dec size))))))))

This algorithm when implemented in Java takes only a few seconds to complete. Why such a large difference?

Try calling time on various parts to see where the time is being spent. Using a long-array and indexes might speed things up, but I can't see it being 4 orders of magnitude different. — Alex Taggart
– Alex Taggart, Commented Mar 25, 2012 at 7:13

mikera · Accepted Answer · 2012-03-25 09:53:01Z

5

Things that look potentially slow to me (though you'll have to benchmark to be sure....)

All your numbers are boxed. This will make anything you do much slower than using primitives. It isn't very idiomatic but you can use primitive arrays in Clojure if you want to get this speedup.
Reducing over a subvec currently creates a lot of temporary objects. There is a patch in the works to fix this (http://dev.clojure.org/jira/browse/CLJ-894) but you might have to wait for Clojure 1.4 or 1.5 for this.
It helps to do (set! *unchecked-math* true) to improve performance of integer operations (obviously, you need to be a bit more careful if overflows might happen)

If I wanted to get this to run really fast in Clojure I'd probably resort to primitive arrays and looping with primitive indexes:

(set! *unchecked-math* true)

(defn count-inversions [coll]
     (let [^ints integers (int-array coll)
           size (long (count integers))]
       (loop [c (long 0)
              i (long 0)]
         (if (>= i size)
           c
           (recur
             (loop [c (long c)
                    v (aget integers i)
                    j (inc i)]
               (if (>= j size)
                 c
                 (recur (long (if (> v (aget integers j)) (inc c) c)) v (inc j))))
             (inc i))))))

(time (count-inversions (for [i (range 100000)] (rand-int 100))))
"Elapsed time: 16036.651863 msecs"

i.e. this code counts all the inversions in 100,000 integers in about 16 seconds on my machine (with Clojure 1.3)

edited Mar 25, 2012 at 9:53

answered Mar 25, 2012 at 9:25

mikera

107k28 gold badges265 silver badges427 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

octopusgrabbus Over a year ago

Would you post a link to the definition of a boxed number? I'm curious about the term. Thank you.

mikera Over a year ago

It basically means that a number is being packaged up as an Object: lots of links on Google, here is one: docs.oracle.com/javase/1.5.0/docs/guide/language/…

Collectives™ on Stack Overflow

Execution time of Clojure program

1 Answer 1

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related