6

I'm trying to use boilerpipe from JRuby. I've seen the guide for calling Java from JRuby, and have used it successfully with another Java package, but can't figure out why the same thing isn't working with boilerpipe.

I'm trying to basically do the equivalent of this Java from JRuby:

URL url = new URL("http://www.example.com/some-location/index.html");
String text = ArticleExtractor.INSTANCE.getText(url);

Tried this in JRuby:

require 'java'
url = java.net.URL.new("http://www.example.com/some-location/index.html")
text = Java::DeL3sBoilerpipeExtractors::ArticleExtractor.INSTANCE.getText(url)

This is based on the API Javadocs for boilerpipe. Here's the error:

jruby-1.6.0 :042 > Java::DeL3sBoilerpipeExtractors::ArticleExtractor
NameError: cannot load Java class deL3sBoilerpipeExtractors.ArticleExtractor
        from org/jruby/javasupport/JavaClass.java:1195:in `for_name'
        from org/jruby/javasupport/JavaUtilities.java:34:in `get_proxy_class'
        from /usr/local/rvm/rubies/jruby-1.6.0/lib/ruby/site_ruby/shared/builtin/javasupport/java.rb:45:in `const_missing'
        from (irb):42:in `evaluate'
        from org/jruby/RubyKernel.java:1087:in `eval'
        from /usr/local/rvm/rubies/jruby-1.6.0/lib/ruby/1.8/irb.rb:158:in `eval_input'
        from /usr/local/rvm/rubies/jruby-1.6.0/lib/ruby/1.8/irb.rb:271:in `signal_status'
        from /usr/local/rvm/rubies/jruby-1.6.0/lib/ruby/1.8/irb.rb:270:in `signal_status'
        from /usr/local/rvm/rubies/jruby-1.6.0/lib/ruby/1.8/irb.rb:155:in `eval_input'
        from org/jruby/RubyKernel.java:1417:in `loop'
        from org/jruby/RubyKernel.java:1190:in `catch'
        from /usr/local/rvm/rubies/jruby-1.6.0/lib/ruby/1.8/irb.rb:154:in `eval_input'
        from /usr/local/rvm/rubies/jruby-1.6.0/lib/ruby/1.8/irb.rb:71:in `start'
        from org/jruby/RubyKernel.java:1190:in `catch'
        from /usr/local/rvm/rubies/jruby-1.6.0/lib/ruby/1.8/irb.rb:70:in `start'
        from /usr/local/rvm/rubies/jruby-1.6.0/bin/irb:17:in `(root)'

Looks like it didn't parse the camelcase into the appropriate Java package name. What am I doing wrong? I believe I've set up my classpath alright (last 3 entries), though there may be some conflict with xerces possibly being included twice:

$ echo $CLASSPATH
:/jellly/Maui1.2:/jellly/Maui1.2/src:/jellly/Maui1.2/bin:/jellly/Maui1.2/lib/commons-io-1.4.jar:/jellly/Maui1.2/lib/commons-logging.jar:/jellly/Maui1.2/lib/icu4j_3_4.jar:/jellly/Maui1.2/lib/iri.jar:/jellly/Maui1.2/lib/jena.jar:/jellly/Maui1.2/lib/maxent-2.4.0.jar:/jellly/Maui1.2/lib/mysql-connector-java-3.1.13-bin.jar:/jellly/Maui1.2/lib/opennlp-tools-1.3.0.jar:/jellly/Maui1.2/lib/snowball.jar:/jellly/Maui1.2/lib/trove.jar:/jellly/Maui1.2/lib/weka.jar:/jellly/Maui1.2/lib/wikipediaminer1.1.jar:/jellly/Maui1.2/lib/xercesImpl.jar:/jellly/boilerpipe-1.1.0/boilerpipe-1.1.0.jar:/jellly/boilerpipe-1.1.0/lib/nekohtml-1.9.13.jar:/jellly/boilerpipe-1.1.0/lib/xerces-2.9.1.jar

2 Answers 2

11

I'd recommend against trying to guess the module name we put under Java::, since for unusual packages it can get mangled pretty badly. Use java_import 'your.weird.package.ArticleExtractor' or if all the package components are compatible with Ruby method naming, you can also do Java::your.weird.package.ArticleExtractor.

Also, since you might run into this... you'll want to reference the INSTANCE variable as ArticleExtractor::INSTANCE, since we map it as a Ruby constant.

Have fun!

Sign up to request clarification or add additional context in comments.

3 Comments

Apologies for creating packages starting with "de.l3s". L3S is the name of the company/research center I was working for when developing the boilerpipe library http://www.L3S.de/
You gotta love StackOverflow. Ask a question about using a library from JRuby and you get the lead developer of JRuby and the creator of the library answering! (Within 6 minutes of asking the question, no less.)
I only answer the easy ones that quickly :)
0

You can also use the nice Jruby Boilerpipe Gem which wraps the Java code

Or the pure ruby implementation of Boilerpipe Ruby Boilerpipe Gem

require 'boilerpipe'  

Boilerpipe::Extractors::ArticleExtractor.text("https://github.com/jruby/jruby/wiki/AboutJRuby")

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.