I'm currently writing a script that iterates over a list of URLs and does some processing on them. One URL in my list is causing a problem, however. The code is as follows:

url = "https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4"
uri = URI.parse(url)
response = Net::HTTP.get_response(uri)

The final line raises the following error:

EOFError: end of file reached
    from /usr/lib/ruby/1.8/net/protocol.rb:135:in `sysread'
    from /usr/lib/ruby/1.8/net/protocol.rb:135:in `rbuf_fill'
    from /usr/lib/ruby/1.8/timeout.rb:67:in `timeout'
    from /usr/lib/ruby/1.8/timeout.rb:101:in `timeout'
    from /usr/lib/ruby/1.8/net/protocol.rb:134:in `rbuf_fill'
    from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
    from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
    from /usr/lib/ruby/1.8/net/http.rb:2028:in `read_status_line'
    from /usr/lib/ruby/1.8/net/http.rb:2017:in `read_new'
    from /usr/lib/ruby/1.8/net/http.rb:1051:in `request'
    from /usr/lib/ruby/1.8/net/http.rb:948:in `request_get'
    from /usr/lib/ruby/1.8/net/http.rb:380:in `get_response'
    from /usr/lib/ruby/1.8/net/http.rb:543:in `start'
    from /usr/lib/ruby/1.8/net/http.rb:379:in `get_response'
    from (irb):5
    from /usr/lib/ruby/1.8/uri/ftp.rb:190

No other URLs in my list seem to be giving me any grief. Can anyone explain why I'm getting this error?

2 Answers

I typed in https://secure.www.alumniconnections.com/, which seemed to redirect me to http://www.harrisconnect.com/. My guess would be that your code is not able to handle the redirect. Try using Mechanize (http://mechanize.rubyforge.org/) to handle this. I would also suggest that you wrap your code in some error handling, such as:

# Prevent infinite loops by capping the number of retries
counter = 0

begin
  # Your Code Here

rescue EOFError
  puts "encountered EOFError"

  # Give up after 3 retries
  if counter < 3
    counter += 1
    puts "retry: #{counter}"
    # retry re-runs the begin block (redo is only valid inside a loop or block)
    retry
  else
    puts "FAILED CONNECTION #{counter} TIMES"
    counter = 0
  end
end

This will attempt to retry the connection, which has helped me when connecting to a lot of URLs in the past.
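
The same pattern can be pulled into a small helper so the attempt cap lives in one place. This is only a sketch: `with_retries` and its `max_attempts` parameter are hypothetical names, not part of Net::HTTP, and the block below stands in for the real request. Note that it counts total attempts rather than retries, and re-raises the error once they are used up.

```ruby
# Hypothetical helper: runs the block, retrying on EOFError
# until max_attempts attempts have been made, then re-raises.
def with_retries(max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue EOFError
    # retry re-runs the begin block from the top
    retry if attempts < max_attempts
    raise
  end
end

# Stand-in for the real request: fails twice, then succeeds.
calls = 0
result = with_retries(3) do |attempt|
  calls += 1
  raise EOFError, "end of file reached" if attempt < 3
  "ok after #{attempt} attempts"
end
puts result  # prints "ok after 3 attempts"
```

In the real script the block body would be the `Net::HTTP` call, and the caller can decide what to do when the final `EOFError` propagates.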

EDIT:

require 'rubygems'
require 'mechanize'

agent = Mechanize.new
html_text = agent.get("https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4").body

# The block form of File.open closes the file automatically
File.open("html_file.html", "w") do |html_file|
  html_file.write(html_text)
end

This writes your webpage to a file just fine for me so give it a try.
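
If adding Mechanize is not an option, the redirect guess can also be checked with Net::HTTP alone. The following is a minimal sketch, assuming a Ruby version in which `Net::HTTP.get_response` enables SSL for https URIs on its own (true of recent versions, not of 1.8). `fetch_following_redirects` and `max_requests` are hypothetical names, and the optional block exists only so the logic can be tried without a network.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: follows redirects, making at most max_requests requests.
# An optional block can replace the real HTTP call (useful for offline testing).
def fetch_following_redirects(url, max_requests = 5, &fetcher)
  fetcher ||= lambda { |uri| Net::HTTP.get_response(uri) }
  uri = URI.parse(url)

  max_requests.times do
    response = fetcher.call(uri)
    return response unless response.is_a?(Net::HTTPRedirection)

    # The Location header may be relative, so resolve it against the current URI.
    uri = URI.join(uri.to_s, response['location'])
  end

  raise ArgumentError, "too many redirects (limit #{max_requests})"
end
```

Calling `fetch_following_redirects(url)` without a block performs real requests. Capping the number of requests guards against redirect loops.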

2 Comments

Could your first snippet potentially lead to an infinite loop?
Yes, I believe it could. Somehow I never caught that. I will make a quick edit to fix it, although I can't guarantee it will be the best solution.
If it's HTTPS and not just HTTP, you might try this (worked on Ruby 1.8.6):

require 'rubygems'
require 'net/https'
require 'uri'

address = "https://www.your-secure-domain-here.com"
uri = URI.parse(address)
http = Net::HTTP.new(uri.host, uri.port)
# SSL has to be switched on explicitly for an https address
http.use_ssl = true
# VERIFY_NONE skips certificate verification; fine for a quick test, not for production
http.verify_mode = OpenSSL::SSL::VERIFY_NONE
request = Net::HTTP::Get.new(uri.request_uri)
request.basic_auth("username", "password")
response = http.request(request)

In my case, I had to pass SECRET-API-KEY and api_token in place of the username and password.

Try it and see if it helps.
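
A likely explanation for the `EOFError` is that the request is sent in plain text to an SSL port, so the server closes the connection before any status line arrives. As a sketch of how the approach above might be applied to the URL from the question: `build_http` is a hypothetical helper name, basic auth is left out on the assumption that the page does not need it, and `VERIFY_PEER` is used in place of `VERIFY_NONE` on the assumption that the server presents a valid certificate.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: builds a Net::HTTP client and switches SSL on
# only when the URI scheme is https.
def build_http(uri)
  http = Net::HTTP.new(uri.host, uri.port)
  if uri.scheme == 'https'
    http.use_ssl = true
    # Assumption: the server certificate is valid, so it can be verified.
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER
  end
  http
end

uri = URI.parse("https://secure.www.alumniconnections.com/olc/pub/CDB/events/attendance.cgi?tmpl=attendance&event=2309515&sort=4")
http = build_http(uri)
request = Net::HTTP::Get.new(uri.request_uri)

# Uncomment to perform the actual network call:
# response = http.request(request)
```

Deriving `use_ssl` from the scheme lets the same helper serve the other, plain-HTTP URLs in the list.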
