Actually I realize that it works well for some categories, like

step %{I go to "https://newyork.craigslist.org/search/spa?s=#{emails}"}

but not for others, like

 # step %{I go to "https://newyork.craigslist.org/search/fbh?s=#{emails}"}

My function worked well for a few days, then it suddenly started raising this error: Net::ReadTimeout (Net::ReadTimeout), right when i = 120.

Is there anything I can do to fix this?

Given(/^I go to "([^"]*)"?/) do |url|
  visit(url)
end

Given("I save all emails") do
  emails = 0
  i = 119
  until emails >= 500
      until i == 120
          fetch_emails(i, emails)
          i += 1
      end
      click_next_button
      emails += 120
      puts emails
      i = 1
      puts i
    end
end

def fetch_emails(i, emails)
      find(:xpath, "(//a[@class='result-title hdrlnk'])[#{i}]").click
      if Capybara.has_xpath?("//button[@class='reply-button js-only']")
        find(:xpath, "//button[@class='reply-button js-only']").click
        sleep(1)
        if Capybara.has_xpath?("//p[@class='reply-email-address']")
          # puts find(:xpath, "//p[@class='reply-email-address']//a").text
          open('RESULTS.csv', 'a') do |f|
            f << find(:xpath, "//p[@class='reply-email-address']//a").text + "\n"
          end
        end
      end
      # step %{I go to "https://newyork.craigslist.org/search/fbh?s=#{emails}"}
       step %{I go to "https://newyork.craigslist.org/search/rfh?s=#{emails}"}
      # step %{I go to "https://newyork.craigslist.org/search/lab?s=#{emails}"}
#      step %{I go to "https://newyork.craigslist.org/search/spa?s=#{emails}"}
      # step %{I go to "https://newyork.craigslist.org/search/trd?s=#{emails}"}
end

def click_next_button
    first(".next").click
    sleep(2)
end
  • It sounds like Craigslist may be throttling you for violating its ToS by harvesting email addresses. Commented Aug 26, 2019 at 3:57
  • I resolved the Net::ReadTimeout error after finding that it was caused by the email-sending functionality not working. Once I configured the correct SMTP settings, it worked properly. Commented Feb 21 at 8:35

1 Answer

If your Chrome has been upgraded to a recent version, use the capabilities below:

# :headless_chrome is an illustrative driver name; Capybara passes `app` to the block.
Capybara.register_driver :headless_chrome do |app|
  capabilities = Selenium::WebDriver::Remote::Capabilities.chrome(
    chromeOptions: {
      args: %w[
        headless disable-gpu no-sandbox
        --window-size=1980,1080 --enable-features=NetworkService,NetworkServiceInProcess
      ]
    }
  )

  Capybara::Selenium::Driver.new app, browser: :chrome, desired_capabilities: capabilities
end
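
If the timeout still shows up occasionally, one option is to rescue Net::ReadTimeout around the call that triggers it and retry a small, bounded number of times. The following is only a minimal sketch: the helper name with_read_timeout_retry and its attempts and wait parameters are hypothetical, not part of Capybara or Selenium. It will not help if the remote site is deliberately throttling the requests.

```ruby
require 'net/http' # loads net/protocol, which defines Net::ReadTimeout

# Hypothetical helper: runs the block and, if it raises Net::ReadTimeout,
# waits `wait` seconds and tries again, up to `attempts` tries in total.
# After the last failed try the exception is re-raised to the caller.
def with_read_timeout_retry(attempts: 3, wait: 5)
  tries = 0
  begin
    tries += 1
    yield
  rescue Net::ReadTimeout
    raise if tries >= attempts
    sleep(wait)
    retry
  end
end
```

In a step definition this could wrap the navigation call, for example with_read_timeout_retry { visit(url) }. Keeping the attempt count small means a persistent failure still surfaces as an error instead of looping forever.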