3

I am having trouble streaming a large result-set from a MySQL database using clojure.java.jdbc. This is what I'm trying now:

(defn etl! [query result-set-fn]
  (jdbc/with-db-transaction [t-conn db-spec]
    (let [conn (jdbc/get-connection t-conn)
          statement (jdbc/prepare-statement conn query
                                            :fetch-size Integer/MIN_VALUE
                                            :concurrency :read-only
                                            :result-type :forward-only)]
      (jdbc/query conn [statement] :result-set-fn first))))

I have also tried it without the transaction and with (.setAutoCommit conn False) (which is necessary for Postgres). According to the MySQL docs, setting that fetch-size, concurrency and result-type should tell MySQL to deliver the results one at a time, but that doesn't seem to be happening; the query hangs and heap consumption rises steadily for at least hundreds of megabytes.

A similar question was asked a few years ago, but the answer has now been deprecated by clojure.java.jdbc: Streaming from MySQL with clojure.java.jdbc

Any ideas?

2
  • I have not played with mysql but have used Postgres recently. Using either :result-set-fn or :row-fn did work lazily for me in that instance. You may wish to try a minimal test with Postgres and see if that solves the problem. Also, for reads, I don't see any benefit to (with-db-transaction ...) Commented Jul 5, 2015 at 22:21
  • Yes, I do this more or less exactly in postgres at work and it works perfectly well. It may be an issue with the MySQL drivers. The only reason I tried the transaction was that the other answer linked above recommended it, and given the amount of ceremony and incantations that are necessary to ostensibly get MySQL to give you things one at a time (MIN_VALUE, etc) it seemed plausible. I'm using MySQL in this case because I'm working with wikipedia data, but maybe it's worth the hassle to just migrate it over to postgres at this point. Commented Jul 6, 2015 at 0:08

1 Answer 1

0

This is very similar to this question: clojure.java.jdbc lazy query which has several answers, including an up-to-date one from me that refers to the recently added reducible-query. I haven't directly confirmed what combination of options you need to persuade MySQL to stream result sets but :fetch-size and, perhaps, the various cursor settings might be useful. PostgreSQL requires :auto-commit? false as well -- I'm not sure whether you might also need that for MySQL.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.