Suppose I have a simple string that I want to parse into an array of strings:

"add (multiply (add 1 2) (add 3 4)) (add 5 6)"

How do I parse it into 3 strings (based on the outer parentheses)?

add
(multiply (add 1 2) (add 3 4))
(add 5 6)

With my OOP mindset, I think I need a for loop with an index and if/else statements to do this.

I have tried parsing it with a string split; however, I got:

command
(multiply
1
(add
3
2))
(add
3
4)

which is not what I expected.

3 Answers

Since your data is already in well-formed Polish notation, you can simply read it as EDN and operate on Clojure's data structures:

(require '[clojure.edn :as edn])

(def s "add (multiply (add 1 2) (add 3 4)) (add 5 6)")

;; wrap the input in parens so the whole string reads as a single list
(map str (edn/read-string (str "(" s ")")))

;;=> ("add" "(multiply (add 1 2) (add 3 4))" "(add 5 6)")

I'm still unsure of your end goal, but this seems to do what you asked.
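If the end goal is to compute a value, the forms returned by the EDN reader can be walked directly. The following is only a minimal sketch under that assumption: the `ops` table and the `evaluate` function are hypothetical names introduced for illustration, and the mapping of `add` and `multiply` to `+` and `*` is a guess at what the symbols mean.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical operator table: maps the symbols from the question
;; to Clojure functions (an assumption about their meaning).
(def ops {'add + 'multiply *})

(defn evaluate
  "Recursively evaluates a form produced by edn/read-string.
  Lists are treated as (operator arg ...); anything else is returned as is."
  [form]
  (if (seq? form)
    (let [[op & args] form]
      (apply (ops op) (map evaluate args)))
    form))

;; Wrap the input in parens so it reads as one nested list.
(def forms
  (edn/read-string "(add (multiply (add 1 2) (add 3 4)) (add 5 6))"))

(evaluate forms)
;;=> 32
```

Because the reader has already built nested lists, no index bookkeeping or manual parenthesis matching is needed.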


You can either use the built-in LispReader:

(import '[clojure.lang LispReader LineNumberingPushbackReader])
(import '[java.io PushbackReader StringReader])

(defn could-read? [pr]
  (try
    (LispReader/read pr nil)
    true
    (catch RuntimeException e false)))

(defn paren-split2 [s]
  (let [sr (StringReader. s)
        pr (LineNumberingPushbackReader. sr)
        inds (loop [result [0]]
               (if (could-read? pr)
                 (recur (conj result (.getColumnNumber pr)))
                 result))
        len (count s)
        bounds (partition 2 1 inds)]
    (for [[l u] bounds
          :let [result (clojure.string/trim (subs s l (min len u)))] :when (seq result)]
      result)))

(paren-split2 "add (    multiply (   add      1 2) (add 3 4))   (add 5   6  )")
;; => ("add" "(    multiply (   add      1 2) (add 3 4))" "(add 5   6  )")

Or you can hand-code a parser:

(def conj-non-empty ((remove empty?) conj))

(defn acc-paren-split [{:keys [dst depth current] :as state} c]
  (case c
    \( (-> state
           (update :depth inc)
           (update :current str c))
    \) (if (= 1 depth)
         {:depth 0 :dst (conj-non-empty dst (str current c)) :current ""}
         (-> state
             (update :depth dec)
             (update :current str c)))
    \space (if (zero? depth)
             {:depth 0 :dst (conj-non-empty dst current) :current ""}
             (update state :current str c))
    (update state :current str c)))

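(defn paren-split [s]
  (let [{:keys [dst current]} (reduce acc-paren-split
                                      {:dst []
                                       :depth 0
                                       :current ""}
                                      s)]
    ;; flush a trailing token that is not followed by a space or ")"
    (conj-non-empty dst current)))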

(paren-split "add (    multiply (   add      1 2) (add 3 4))   (add 5   6  )")
;; => ["add" "(    multiply (   add      1 2) (add 3 4))" "(add 5   6  )"]

Note: Either approach will preserve spaces in the input strings.


You could use read-string from clojure.core, which uses Clojure's built-in reader. In each iteration we read one form, convert it back to a string with str, drop that many characters from the front of the input, and clojure.string/trim the remainder. The cycle repeats until the trimmed remainder is empty, and then the collected result is returned. This assumes the printed form has the same length as the text it was read from, which holds for single-spaced input like the example.

(defn pre-parse [s]
  (loop [s s
         acc []]
    (if (zero? (count s))
      acc
      (let [chunk (read-string s)   ; read the first form only
            s_ (str chunk)          ; its printed representation
            rest-s (clojure.string/trim (subs s (count s_)))]
        (recur rest-s (conj acc s_))))))

recur jumps back to loop and rebinds the loop variables to its arguments, in the order loop declares them. We can test it with:

(def x "add (multiply (add 1 2) (add 3 4)) (add 5 6)")
(pre-parse x)
;; => ["add" "(multiply (add 1 2) (add 3 4))" "(add 5 6)"]

1 Comment

read-string is considered unsafe in general. clojure.edn/read-string is usually recommended in such cases.
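Following that suggestion, here is a minimal sketch of the same loop built on clojure.edn/read, which reads one form at a time from a PushbackReader. The function name `edn-forms` and the `::eof` sentinel are assumptions made for this example. Unlike the substring-based versions, it returns the printed form of each item, so extra whitespace in the input is not preserved.

```clojure
(require '[clojure.edn :as edn])
(import '[java.io PushbackReader StringReader])

(defn edn-forms
  "Reads successive top-level EDN forms from s and returns them as strings."
  [s]
  (let [rdr (PushbackReader. (StringReader. s))]
    (loop [acc []]
      ;; With the :eof option, edn/read returns the sentinel at end of
      ;; input instead of throwing.
      (let [form (edn/read {:eof ::eof} rdr)]
        (if (= form ::eof)
          acc
          (recur (conj acc (str form))))))))

(edn-forms "add (multiply (add 1 2) (add 3 4)) (add 5 6)")
;;=> ["add" "(multiply (add 1 2) (add 3 4))" "(add 5 6)"]
```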
