
I'm new to Clojure and I need some examples. How do I parse an HTML file using Clojure?


3 Answers

18

Enlive is a great tool for this. In short:

(ns foo.bar
  (:require [net.cgrand.enlive-html :as html]))

(defn fetch-page [url]
  (html/html-resource (java.net.URL. url)))

Here is a nice tutorial on using it both as a scraper/parser and as a template engine:

Here is a short example of scraping a page.

Another option is clj-tagsoup. Enlive also uses tagsoup, but in addition has a pluggable parser so you can add support for other parsers.
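Once a page is parsed, Enlive's selectors can pull nodes out of the resulting tree. The following is a minimal sketch, assuming Enlive is on the classpath; the namespace, the sample markup, and the helper names (`parse-string`, `link-hrefs`, `heading-texts`) are made up for illustration, and a `StringReader` stands in for the URL so the example does not need network access.

```clojure
(ns foo.scrape
  (:require [net.cgrand.enlive-html :as html]))

;; Hypothetical sample markup, used here instead of fetching a real page.
(def sample-html
  "<html><body><h1>Title</h1><a href=\"/a\">First</a><a href=\"/b\">Second</a></body></html>")

(defn parse-string
  "Parses an HTML string into Enlive nodes (maps with :tag, :attrs, :content)."
  [s]
  (html/html-resource (java.io.StringReader. s)))

(defn link-hrefs
  "Returns the href attribute of every <a> element in the parsed nodes."
  [nodes]
  (map #(get-in % [:attrs :href]) (html/select nodes [:a])))

(defn heading-texts
  "Returns the text content of every <h1> element in the parsed nodes."
  [nodes]
  (map html/text (html/select nodes [:h1])))

(def page (parse-string sample-html))
```

Calling `(link-hrefs page)` or `(heading-texts page)` then walks the parsed tree; the same helpers should work on the result of `fetch-page` above, since both go through `html/html-resource`.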


2 Comments

Can I parse an HTML file without Enlive or another parser, using only Clojure?
Well, you can get the content of a web page as a string just by calling (slurp "http://example.com"), but in order to work with the content in a manageable way you need a parser (like Enlive).
4

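Clojure's built-in XML parsing library, clojure.xml, can do this, provided the file is well-formed XML (that is, XHTML). From the docstring of clojure.xml/parse: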

Parses and loads the source s, which can be a File, InputStream or String naming a URI. Returns a tree of the xml/element struct-map, which has the keys :tag, :attrs, and :content. and accessor fns tag, attrs, and content. Other parsers can be supplied by passing startparse, a fn taking a source and a ContentHandler and returning a parser
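As a rough sketch of what that looks like in practice, the snippet below parses a small, well-formed XHTML string with `clojure.xml/parse` and walks the result with `xml-seq`. The namespace, the sample markup, and the helper names (`parse-xhtml`, `element-tags`, `text-of`) are hypothetical. Because the parser is a strict XML parser, real-world HTML with unclosed tags will likely make it throw.

```clojure
(ns example.xml-demo
  (:require [clojure.xml :as xml]))

;; Hypothetical sample document; it must be well-formed XML.
(def sample-xhtml
  "<html><body><h1>Title</h1><p class=\"intro\">Hello</p></body></html>")

(defn parse-xhtml
  "Parses a well-formed XHTML string into a tree of :tag/:attrs/:content maps."
  [s]
  ;; A String argument to xml/parse is treated as a URI, so wrap the
  ;; markup in an InputStream instead.
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn element-tags
  "Returns the tag of every element in the tree, in document order."
  [tree]
  (->> (xml-seq tree)
       (filter map?)
       (map :tag)))

(defn text-of
  "Returns the text children of every element with the given tag."
  [tree tag]
  (->> (xml-seq tree)
       (filter #(= tag (:tag %)))
       (mapcat :content)
       (filter string?)))

(def tree (parse-xhtml sample-xhtml))
```

For this sample, `(element-tags tree)` yields `(:html :body :h1 :p)` and `(text-of tree :p)` yields `("Hello")`.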

Alternatively, use Enlive, a library written in Clojure, or the Java-based HtmlCleaner.


1

HTML Parsers

source - https://www.clojure-toolbox.com

