32

Like the DOMDocument class in PHP, is there any class in Ruby (i.e. core Ruby) to parse an HTML document and get the values of its node elements?

4 Answers

49

There is no built-in HTML parser (yet), but some very good ones are available, in particular Nokogiri.

Meta-answer: For common needs like these, I'd recommend checking out the Ruby Toolbox site. You'll notice that Nokogiri is the top recommendation for HTML parsers.

9

You should check out Hpricot. It's exceedingly good. It's not 'core' Ruby, but it's a commonly used gem.

1 Comment

Hpricot sadly is no more. Nokogiri is now the preferred solution.
7

Ruby Cheerio is a jQuery-style HTML parser for Ruby: a much-simplified take on Nokogiri, aimed at crawlers. It is the Ruby counterpart of the popular Node.js package cheerio.

Follow the link for a simple crawler example.

gem install ruby-cheerio

require 'ruby-cheerio'

jQuery = RubyCheerio.new("<html><body><h1 class='one'>h1_1</h1><h1>h1_2</h1></body></html>")

jQuery.find('h1').each do |head_one|
  p head_one.text
end

# getting attribute values like jQuery.
p jQuery.find('h1.one')[0].prop('h1','class')

# function chaining similar to jQuery.
p jQuery.find('body').find('h1').first.text

2 Comments

Very good approach! Nice recommendation! Thanks @dineshsprabu.
Thanks Fernando Kosh
5

You can also try Oga by Yorick Peterse.

It is an XML/HTML parser written in Ruby that does not require system libraries such as libxml. You can find it here: https://github.com/YorickPeterse/oga

3 Comments

And now it's abandoned.
Oga seems alive and well to me here: github.com/YorickPeterse/oga As of today, the most recent commit was Sep 9, 2025. Am I missing something?
Check the date of the previous commit and the date of my comment. You can also check the date of the latest gem release. So… yeah, there is one new commit, and that's all there has been in 2+ years.
