
Can you suggest the shortest and easiest way to extract a substring from each string in a sequence? I'm getting this collection from the enlive framework, which pulls content from a web page, and here is the result:

("background-image:url('http://s3.mangareader.net/cover/gantz/gantz-r0.jpg')"
 "background-image:url('http://s3.mangareader.net/cover/deadman-wonderland/deadman-wonderland-r0.jpg')"
 "background-image:url('http://s3.mangareader.net/cover/12-prince/12-prince-r1.jpg')" )

What I would like is some help extracting the URL from each string in the sequence. I tried something with the partition function, but with no success. Can anyone propose a regex, or any other approach to this problem?

Thanks

2 Answers


re-seq to the rescue! (Here d is the sequence of strings from the question.)

user> (map #(re-seq #"http.*jpg" %) d)
(("http://s3.mangareader.net/cover/gantz/gantz-r0.jpg")
 ("http://s3.mangareader.net/cover/deadman-wonderland/deadman-wonderland-r0.jpg")
 ("http://s3.mangareader.net/cover/12-prince/12-prince-r1.jpg"))

re-find is even better:

user> (map #(re-find #"http.*jpg" %) d)
("http://s3.mangareader.net/cover/gantz/gantz-r0.jpg" 
 "http://s3.mangareader.net/cover/deadman-wonderland/deadman-wonderland-r0.jpg" 
 "http://s3.mangareader.net/cover/12-prince/12-prince-r1.jpg")

because it returns just the first match as a string, so it doesn't add an extra layer of seq.
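One caveat with "http.*jpg": the `.*` is greedy and the pattern only fits URLs ending in jpg. If that matters, a capture group anchored on the quotes is a possible alternative. This is only a sketch; `url-pattern` and `style->url` are names made up for illustration, and `d` is redefined here from the question's sample data so the snippet runs on its own.

```clojure
;; Sample input copied from the question; `d` is the name used in this answer.
(def d
  ["background-image:url('http://s3.mangareader.net/cover/gantz/gantz-r0.jpg')"
   "background-image:url('http://s3.mangareader.net/cover/deadman-wonderland/deadman-wonderland-r0.jpg')"
   "background-image:url('http://s3.mangareader.net/cover/12-prince/12-prince-r1.jpg')"])

;; [^']+ stops at the closing quote, so the match cannot run past the URL.
(def url-pattern #"url\('([^']+)'\)")

(defn style->url
  "Returns the text inside url('...'), or nil when the pattern does not match.
  Hypothetical helper, not part of enlive or clojure.core."
  [s]
  ;; With a capture group, re-find returns [whole-match group-1];
  ;; `second` picks the group and yields nil if re-find returned nil.
  (second (re-find url-pattern s)))

(map style->url d)
;=> ("http://s3.mangareader.net/cover/gantz/gantz-r0.jpg"
;    "http://s3.mangareader.net/cover/deadman-wonderland/deadman-wonderland-r0.jpg"
;    "http://s3.mangareader.net/cover/12-prince/12-prince-r1.jpg")
```

Strings that do not contain url('...') come back as nil instead of throwing, which makes it easy to filter them out with `(keep style->url d)`.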


4 Comments

Hi, great solution, thanks a ton! By the way, can you recommend a tutorial for regex in Clojure, so that I don't have to ask for help each time I tangle with them?
I would love to find that as well, anyone?
The best thing I can think of is calling (find-doc #"^re-") and reading the results. I don't know of anything better.
For the regex syntax itself, it's worth noting that it is the same as Java's, so any of the many Java tutorials are relevant, e.g. tutorials.jenkov.com/java-regex/syntax.html

Would something simple like this work for you?

(defn extract-url [s]
  (subs s (inc (.indexOf s "'")) (.lastIndexOf s "'")))

This function will return a string containing all the characters between the first and last single quotes.

Assuming your sequence of strings is named ss, then:

(map extract-url ss)
;=> ("http://s3.mangareader.net/cover/gantz/gantz-r0.jpg"
;    "http://s3.mangareader.net/cover/deadman-wonderland/deadman-wonderland-r0.jpg"
;    "http://s3.mangareader.net/cover/12-prince/12-prince-r1.jpg")

This is definitely not a generic solution, but it fits the input you have provided.
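For instance, if a string has fewer than two single quotes, `subs` is called with invalid indices and throws. A guarded variant might look like the sketch below; `extract-url-safe` is a hypothetical name, and returning nil for malformed input is an assumption about what you would want.

```clojure
(defn extract-url-safe
  "Like extract-url, but returns nil unless s contains two distinct single quotes.
  Hypothetical helper for illustration."
  [^String s]
  (let [start (.indexOf s "'")       ; -1 when there is no quote at all
        end   (.lastIndexOf s "'")]  ; equals start when there is only one quote
    ;; Only slice when a first quote exists and a later, different quote closes it.
    (when (< -1 start end)
      (subs s (inc start) end))))

(extract-url-safe "background-image:url('http://s3.mangareader.net/cover/gantz/gantz-r0.jpg')")
;=> "http://s3.mangareader.net/cover/gantz/gantz-r0.jpg"

(extract-url-safe "background-image:none")
;=> nil
```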

1 Comment

Yup, it works like a charm. I thought this could be done more elegantly with a regex, but this is way more comprehensible to an OO guy like me. Thanks
