I have the following code which increments the first element of every pair in a vector:

(vec (map (fn [[key value]] [(inc key) value]) [[0 :a] [1 :b]]))

However, I fear this code is inelegant, as it first creates a sequence using map and then converts it back to a vector.

Consider this analog:

(into [] (map (fn [[key value]] [(inc key) value]) [[0 :a] [1 :b]]))

On IRC I was told that using the code above is bad, because into expands into (reduce conj [] (map-indexed ...)), which produces many intermediate objects in the process. Then I was told that into actually doesn't expand into (reduce conj ...) and uses transients when it can. Also, measuring elapsed time showed that into is actually faster than vec.

So my questions are:

  1. What is the proper way to use map over vectors?
  2. What happens underneath when I use vec and into with vectors?

1 Answer

Actually as of Clojure 1.4.0 the preferred way of doing this is to use mapv, which is like map except its return value is a vector. It is by far the most efficient approach, with no unnecessary intermediate allocations at all.
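
As a minimal sketch, here is the mapv version applied to the pairs from the question. The name inc-first is a hypothetical helper introduced only for illustration; it is not part of clojure.core.

```clojure
;; Hypothetical helper (not from the original post):
;; increment the first element of a [key value] pair.
(defn inc-first [[k v]]
  [(inc k) v])

;; mapv behaves like map but returns a vector directly (Clojure 1.4.0+),
;; so no separate vec or into step is needed.
(def result (mapv inc-first [[0 :a] [1 :b]]))

(println result)           ; prints [[1 :a] [2 :b]]
(println (vector? result)) ; prints true
```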

Clojure 1.5.0 will bring a new reducers library which will provide a generic way to map, filter, take, drop etc. while creating vectors, usable with into []. You can play with it in the 1.5.0 alphas and in the recent tagged releases of ClojureScript.
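
A rough sketch of the reducers approach, assuming Clojure 1.5.0 or later where clojure.core.reducers is available. The alias r and the sample inputs are illustrative choices, not something the answer prescribes.

```clojure
(require '[clojure.core.reducers :as r])

;; r/map returns a reducible recipe rather than a lazy seq;
;; into [] then reduces it straight into a vector.
(def pairs
  (into [] (r/map (fn [[k v]] [(inc k) v]) [[0 :a] [1 :b]])))

;; Reducer steps compose before anything is materialized.
(def evens
  (into [] (r/filter even? (r/map inc (range 10)))))

(println pairs) ; prints [[1 :a] [2 :b]]
(println evens) ; prints [2 4 6 8 10]
```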

As for (vec some-seq) and (into [] some-seq), the first ultimately delegates to a Java loop which pours some-seq into an empty transient vector, while the second does the same thing in very efficient Clojure code. In both cases there are some initial checks involved to determine which approach to take when constructing the final return value.

vec and into [] are significantly different for Java arrays of small length (up to 32) -- the first will alias the array (use it as the tail of the newly created vector) and demands that the array not be modified subsequently, lest the contents of the vector change (see the docstring); the latter creates a new vector with a new tail and doesn't care about future changes to the array.
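
The difference can be probed with a small experiment along these lines. This is only a sketch: it uses an Object array created with object-array, and the names arr, aliased and copied are made up for the example. The result for the vec case is deliberately not asserted here, since it depends on the aliasing behaviour described above.

```clojure
;; An Object array with fewer than 32 elements.
(def arr (object-array [1 2 3]))

(def aliased (vec arr))     ; may reuse arr as the vector's tail
(def copied  (into [] arr)) ; builds a fresh vector with its own tail

;; Mutate the array after both vectors have been created.
(aset arr 1 10)

(println copied)  ; prints [1 2 3], unaffected by the mutation
(println aliased) ; if arr is aliased as described, this reflects the change
```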

5 Comments

Thank you. (mapv f args) is significantly more concise than (into (vector) (map f args))
Is this answer still current? I'm mostly interested in ClojureScript. Thank you.
mapv still exists and is the best way to go if you want to accumulate the results of mapping a function across a seqable collection in a vector. Reducers also still exist, however Clojure now has transducers which are great for building up vectors and other collections using more involved transformations – (into [] (map inc) input-coll) etc. Reducers are still the best option if you can benefit from parallelization via clojure.core.reducers/fold, but this is not relevant to ClojureScript.
Does this still hold for vec and "small arrays"? I did the following experiment:
```
(def my-a (int-array [1 2 3]))
(def my-v (vec my-a))
my-v ;=> [1 2 3]
(aset my-a 1 10)
(java.util.Arrays/toString my-a) ;=> "[1, 10, 3]"
my-v ;=> [1 2 3]
```
@JurajMartinka It does, but only for arrays of Object. Yours is a primitive array, so when you call vec on it, its items will all be boxed and copied into a new object array of the same length. This is because persistent vectors returned by vec and vector and created by the literal syntax […] always store object references. There's a distinct vector type built in that can hold primitive items, see clojure.core/vector-of; NB. currently using it generally entails boxing on access, so one should carefully assess whether it's appropriate and not default to it even with primitives.
