This is a newbie question. I have a function that parses a web page and returns a series of 5 elements. I then use the println function to check that it worked correctly.

...
(defn select-first-index-page-elements [source element n]
    ((get-parsing-logic source "parsing-logic-index-page" element "final-touch-fn")
        (nth 
            (html/select 
                (fetch-first-page source)
                (get-parsing-logic source "parsing-logic-index-page" element "first-touch"))
            n)))

(defn parsing-source [source]
    (loop [n 0]
        (when (< n (count-first-index-page-elements source "title"))
            (println ; the group of elements:
                (select-first-index-page-elements source "date" n)
                " - "
                (select-first-index-page-elements source "title" n)
                " - "
                (select-first-index-page-elements source "url" n)
                "\n")
            (recur (inc n)))))

(parsing-source "events-directory-website")

Now, instead of calling println, how could I store those elements in a DB? How can I avoid storing a given group of elements if it is already in the DB? And how can I then print only the new groups of elements that the parsing function found?

1 Answer

You might want to check out SQL Korma.

Using sql korma:

how could I store those elements into a DB?

;; use a vector literal: a bare ("a" "b" "c") would try to call "a" as a function
(insert my-elements
  (values [{:elements ["a" "b" "c"]}]))

And how I can not store a given group of element if it is already in the db?

;; using some elements you're looking for:
;; select returns the matching rows, so insert only when there are none
(when (empty? (select my-elements
                      (where {:elements the-elements-youre-looking-for})))
  (insert my-elements
      (values [{:elements the-elements-youre-looking-for}])))

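How can I print then only the new group of elements that the parsing function did find?

You could solve this with the same (select ...) check as in the answer above: a group is new exactly when the select returns no rows for it, so print it at the point where you insert it.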
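The "store only what is new, then print only what was stored" flow can be sketched without any particular database. This is only an illustration under stated assumptions: an atom holding a set stands in for the table, the URL is assumed to identify a group uniquely, and `seen-urls`, `store-new!`, `print-events`, and the sample events are hypothetical names that are not part of Korma or of the question's code. With Korma, the membership test would become the (select ...) check and the `swap!` would become the (insert ...).

```clojure
;; Stand-in for the database table: the set of URLs already stored.
;; Assumption: the URL uniquely identifies a group of elements.
(def seen-urls (atom #{}))

(defn store-new!
  "Stores each event whose :url has not been seen yet and returns
  a vector containing only the newly stored events."
  [events]
  (reduce (fn [new-events event]
            (if (contains? @seen-urls (:url event))
              new-events                          ; already stored: skip it
              (do (swap! seen-urls conj (:url event))
                  (conj new-events event))))      ; new: store and remember it
          []
          events))

(defn print-events
  "Prints one line per event, mirroring the println in the question."
  [events]
  (doseq [{:keys [date title url]} events]
    (println date "-" title "-" url)))

;; Hypothetical sample data: two parsing runs that overlap on "/b".
(def first-run
  (store-new! [{:date "2013-01-01" :title "A" :url "/a"}
               {:date "2013-01-02" :title "B" :url "/b"}]))

(def second-run
  (store-new! [{:date "2013-01-02" :title "B" :url "/b"}
               {:date "2013-01-03" :title "C" :url "/c"}]))

;; Only the event with :url "/c" is new in the second run.
(print-events second-run)
```

Returning the new events from the storing function keeps printing separate from storage, so the same function works whether the results are printed, logged, or ignored.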

Hope that helps.

3 Comments

I get CannotAcquireResourceException A ResourcePool could not acquire a resource from its primary factory or source. com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable (BasicResourcePool.java:1319) when, instead of (println ...), I put:

(let [next-url (select-first-index-page-elements source "url" n)]
  (if-not [db (select events (where {:url next-url}))]
    (let [next-date (select-first-index-page-elements source "date" n)
          next-title (select-first-index-page-elements source "title" n)]
      (insert events (values [{:date next-date :title next-title :url next-url}])))))
Maybe check out this. Make sure the SQL server is running, and make sure you've declared the db somewhere in the code, like here.
Also, the simplest example of db declaration is probably here under "Examples of generated queries:" where defdb is used.
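For reference, here is a minimal sketch of the kind of declaration those comments refer to, assuming Korma and a SQLite JDBC driver are on the classpath. The namespace `example.db`, the file name `events.db`, the function `store-event-if-new!`, and an existing `events` table with `date`, `title`, and `url` columns are all assumptions for illustration. Whether a missing declaration is the actual cause of the exception above cannot be confirmed from the comment alone.

```clojure
(ns example.db
  (:require [korma.db :refer [defdb sqlite3]]
            [korma.core :refer [defentity insert values select where]]))

;; Declare the connection once, before any query runs.
;; Assumption: a SQLite file named events.db; other helpers in korma.db
;; (e.g. postgres, mysql) take their own connection options.
(defdb db (sqlite3 {:db "events.db"}))

;; Map the (assumed, already created) events table to an entity.
(defentity events)

(defn store-event-if-new!
  "Inserts the event unless a row with the same :url already exists.
  Returns true when a row was inserted, false otherwise."
  [event]
  (if (empty? (select events (where {:url (:url event)})))
    (do (insert events (values event))
        true)
    false))
```

The boolean return value lets the caller decide whether to print the group, which covers the "print only the new ones" part of the question.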
