3

There is a great bookmarklet script that takes an HTML document and, using JavaScript, strips out the main article content (like Instapaper, but better).

I want to know the most efficient way to run this same script on the server side with Rails 3.

Is it even possible? Ideally, I would request a URL from the server (in Rails), parse the response with the JavaScript, and return the processed text (and then persist it to a database).

I was thinking of just porting the script to Ruby, but this seems silly, especially since jQuery and JavaScript itself have a bunch of built-in functions for parsing a DOM. On the other hand, the script uses DOM constructs from the browser, so it might require a server-side browser?

Any suggestions?


3 Answers

2

We actually do this very thing in one of our webapps. If you want to implement this functionality server-side in your Ruby on Rails application, your best bet is to use a Ruby HTML/XML parsing library, such as Nokogiri.

I wrote an article specifically explaining how to strip out the important information from a linked webpage, like Instapaper does, using Ruby + Nokogiri.

Create a Printable Format for Any Webpage with Ruby and Nokogiri


1 Comment

Awesome. This makes total sense. Thanks!
0

Maybe run the script in something like Rhino Shell and capture the output?
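On the Rails side, "run the script and capture the output" could be sketched as below, using Ruby's standard `Open3` module. The helper name `run_js_extractor` and the idea that the script reads HTML on stdin and prints the extracted text on stdout are assumptions for illustration; the actual command line depends on the JavaScript runtime, and the script would still need a DOM implementation available in that runtime.

```ruby
require "open3"

# Pipes +html+ to an external command on stdin and returns what the command
# prints on stdout. +command+ is an argv array; the exact invocation (runtime
# and script path) is up to the caller. Raises if the command exits non-zero.
def run_js_extractor(html, command)
  stdout, stderr, status = Open3.capture3(*command, stdin_data: html)
  raise "extractor failed (#{status.exitstatus}): #{stderr.strip}" unless status.success?

  stdout
end

# Hypothetical usage, assuming a wrapper script named extract.js exists:
#   text = run_js_extractor(html, ["node", "extract.js"])
```

Passing the command as an array avoids going through a shell, so the HTML and any arguments are not subject to shell quoting.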


0

Node.js comes to mind for server-side JavaScript.

I think the JavaScript in Readability could also be translated into Ruby, but this would probably require a serious amount of work.
