
I have been wondering how Ruby integrates with HTTP servers like Apache. As a Java developer, I understand the full stack of a servlet container, how the request lifecycle works, and how a servlet or filter can integrate different web frameworks into the container's standardized HTTP lifecycle.

That being said, I was never clear on how other languages like Ruby manage to hook into the Apache server. Is it via a C API? Some other standard?

  • Phusion Passenger's Design and Architecture document has a "Common Models" section that provides a great overview of the various ways this is done. Commented Jan 11, 2017 at 19:14

3 Answers

4

There are many different ways to do that.

CGI

The oldest way is the Common Gateway Interface (CGI), introduced in 1993 in the NCSA webserver and supported by practically every webserver since. It is an extremely simple protocol: the webserver sets a standardized set of environment variables containing information about the request and starts a program. That program examines the environment variables (and reads the request body, if there is one, from stdin), then writes its response to stdout, which the webserver sends back to the client. The next request starts the cycle (and the program) all over again.

Here's an extremely simple CGI example in Ruby (I don't have Apache installed, so I couldn't test this):

#!/usr/bin/env ruby

# We need to send the header as well, the webserver won't do it for us
puts 'Content-type: text/html'
puts

puts <<~"HTML"
  <!DOCTYPE html>
  <html lang="en">
    <head>
      <meta charset="utf-8">
      <title>CGI Demo Script in Ruby</title>
    </head>
    <body>
      <p>
        My name is #{ENV['SCRIPT_NAME']} and you called me with the path #{ENV['PATH_INFO']} and query string #{ENV['QUERY_STRING']}.
      </p>
    </body>
  </html>
HTML
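
To see the other half of that exchange, here is a minimal sketch of what the webserver does for each CGI request, assuming a plain GET with no request body. The `run_cgi` helper and the inline `CGI_PROGRAM` are hypothetical stand-ins made up for illustration; this is not how Apache is actually implemented.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical stand-in for a CGI program on disk.
CGI_PROGRAM = <<~'RUBY'
  puts 'Content-Type: text/plain'
  puts
  puts "path=#{ENV['PATH_INFO']} query=#{ENV['QUERY_STRING']}"
RUBY

# Plays the webserver's role for one request: describe the request in
# environment variables, start a fresh process, and capture its stdout.
def run_cgi(path_info:, query_string:)
  env = {
    'GATEWAY_INTERFACE' => 'CGI/1.1',
    'REQUEST_METHOD'    => 'GET',
    'SCRIPT_NAME'       => '/demo.cgi', # assumed name, for illustration only
    'PATH_INFO'         => path_info,
    'QUERY_STRING'      => query_string
  }
  output, status = Open3.capture2(env, RbConfig.ruby, '-e', CGI_PROGRAM)
  raise 'CGI program failed' unless status.success?

  # The program's output is a header block, a blank line, then the body.
  output.split(/\r?\n\r?\n/, 2)
end

headers, body = run_cgi(path_info: '/foo', query_string: 'a=1')
puts headers
puts body
```

Every call to `run_cgi` spawns a new Ruby interpreter, which is exactly the per-request cost that the later protocols try to avoid.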

FastCGI

Since CGI starts a new process for every single request, and communication via environment variables and stdout isn't exactly efficient either, FastCGI was created in 1996. The idea is similar to CGI, but the handler process is not started anew for every request. Instead, it stays in memory and handles requests in a loop, and the webserver and the handler communicate over an efficient binary protocol.

Communication uses either Unix sockets or TCP.
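
To give a feel for what "binary protocol" means here, the following sketch packs and unpacks a single FastCGI record: an 8-byte header followed by the content and optional padding. The helper names `encode_record` and `decode_header` are made up for this example, and a real handler would normally rely on a FastCGI library rather than hand-rolled framing.

```ruby
FCGI_VERSION_1 = 1
FCGI_STDOUT    = 6 # record type for response data sent by the handler

# Header layout: version (1 byte), type (1), request id (2, big-endian),
# content length (2, big-endian), padding length (1), reserved (1).
HEADER_FORMAT = 'CCnnCC'

# Builds one record. Content is limited to 65535 bytes per record because
# the length field is 16 bits wide. Padding to a multiple of 8 bytes is
# optional in the protocol; it is done here only as an illustration.
def encode_record(type, request_id, content)
  content = content.b
  padding = (8 - content.bytesize % 8) % 8
  header = [FCGI_VERSION_1, type, request_id, content.bytesize, padding, 0]
           .pack(HEADER_FORMAT)
  header + content + ("\0" * padding)
end

# Parses the first 8 bytes of a record back into named fields.
def decode_header(bytes)
  version, type, request_id, content_length, padding_length, =
    bytes.unpack(HEADER_FORMAT)
  { version: version, type: type, request_id: request_id,
    content_length: content_length, padding_length: padding_length }
end

record = encode_record(FCGI_STDOUT, 1, 'hello')
p decode_header(record[0, 8])
```

Because every record carries a request id, several requests can share one connection to the same long-lived handler process.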

SCGI

The Simple Common Gateway Interface (SCGI), published in 2006, is similar to FastCGI but simpler: it uses a text-based communication protocol (based on netstrings) instead of a binary one. Like FastCGI, it avoids the overhead of firing up a process for every request: the handler program keeps running and simply processes requests in a loop.
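
As a rough illustration of the netstring framing, this sketch builds the bytes a webserver would send to an SCGI handler for one request. The `scgi_request` helper is hypothetical; the header names follow the usual CGI conventions.

```ruby
# Builds an SCGI request: the headers are NUL-terminated name/value pairs,
# wrapped in a netstring ("<length>:<data>,"), followed by the raw body.
# CONTENT_LENGTH comes first and an "SCGI" => "1" header is included.
def scgi_request(headers, body = '')
  pairs = { 'CONTENT_LENGTH' => body.bytesize.to_s, 'SCGI' => '1' }.merge(headers)
  payload = pairs.map { |name, value| "#{name}\0#{value}\0" }.join
  "#{payload.bytesize}:#{payload},#{body}"
end

request = scgi_request(
  { 'REQUEST_METHOD' => 'POST', 'REQUEST_URI' => '/deepthought' },
  'What is the answer to life?'
)
puts request.inspect
```

The length prefix lets the handler read the whole header block in one go without scanning for a terminator, which is what keeps the protocol simple to parse.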

In-process modules

Almost every webserver has a module API that allows you to add your own code to the running webserver process. One of the earliest was the Netscape NSAPI, which appeared in the mid-90s and was a direct competitor to CGI; FastCGI, on the other hand, was developed as a reaction to it. IIS, Apache, Nginx, etc. all have similar APIs. These differ from FastCGI and SCGI mainly in that they don't run as a separate process but rather within the webserver process. This gives them maximum performance and maximum access to the webserver internals, but also maximum potential for damage.

There are, or were, a couple of modules for running a Ruby execution engine inside a webserver: mod_ruby was an Apache module for embedding MRI inside Apache, and mod_rubinius did the same thing for Rubinius. IronRuby could run inside IIS as a plugin.

This is the solution that is classically employed by PHP (mod_php) and Perl (mod_perl). There are also mod_python, even mod_mono and mod_java.

Webserver interfaces

Webserver interfaces provide a standardized interface between web frameworks and webservers. Actually, Servlets could be seen as a distant relative.

The most well-known interface in the Ruby world is Rack, but before there was Rack in Ruby, there were already WSGI (Web Server Gateway Interface) in Python, PSGI in Perl, and so on. The main idea behind such interfaces is that they provide a narrow, simple interface that decouples web frameworks from webservers. E.g. any Rack-compliant framework (Rails, Sinatra, Padrino, Cuba, Lotus, Ramaze, NYNY, Nancy, Crepe, Hobbit, Grape, Roy, …) will work with any Rack-compliant webserver (Mongrel, Thin, Unicorn, Puma, …). Before there was Rack, for example, Rails had different code for running under Lighttpd, SCGI, FastCGI, CGI, and Webrick. Now, there's just Rack.
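
To show how narrow that interface is, here is a minimal sketch of a Rack-style application that needs no gem at all: any object that responds to `call`, takes an environment hash, and returns a status, a headers hash, and a body that responds to `each`. `HelloApp` is a made-up name, and the hand-built `env` contains only a few of the keys a real Rack server would supply.

```ruby
# A Rack-style app is just a callable: env hash in, [status, headers, body] out.
HelloApp = lambda do |env|
  text = "You requested #{env['PATH_INFO']} with query string #{env['QUERY_STRING']}\n"
  [200, { 'content-type' => 'text/plain' }, [text]]
end

# Normally a Rack-compliant server builds env from the HTTP request.
# Here it is filled in by hand to show that the app is plain Ruby.
env = {
  'REQUEST_METHOD' => 'GET',
  'PATH_INFO'      => '/hello',
  'QUERY_STRING'   => 'name=world'
}

status, headers, body = HelloApp.call(env)
puts status
puts headers.inspect
body.each { |chunk| puts chunk }
```

Because the app never touches a socket or a server API, the same object can be handed unchanged to any server that speaks this calling convention.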

For webservers which are not themselves Rack-compliant (e.g. because they were created before Rack or because they are language-agnostic), there are ways to implement Rack on top of them using any of the above-mentioned techniques, e.g. there is a mod_rack for Apache and Nginx.
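
As one illustration of layering such an interface on an older technique, the sketch below adapts a Rack-style app to CGI: it builds the environment hash from the CGI variables and prints the app's response in CGI format. `serve_cgi` and `EchoApp` are hypothetical names, and the sketch is deliberately simplified (a real adapter would populate more `rack.*` keys and handle errors).

```ruby
require 'stringio'

# Hypothetical Rack-style app used only to exercise the adapter.
EchoApp = lambda do |env|
  [200, { 'content-type' => 'text/plain' }, ["hi #{env['PATH_INFO']}"]]
end

# Runs the app once, the way a CGI program would: the CGI environment
# variables become the env hash, and the response goes to stdout.
def serve_cgi(app, env_vars: ENV.to_h, input: $stdin, output: $stdout)
  env = env_vars.merge('rack.input' => input, 'rack.url_scheme' => 'http')
  status, headers, body = app.call(env)

  output.print "Status: #{status}\r\n"
  headers.each { |name, value| output.print "#{name}: #{value}\r\n" }
  output.print "\r\n"
  body.each { |chunk| output.print chunk }
ensure
  body.close if body.respond_to?(:close)
end

# Simulate one CGI invocation without a webserver.
out = StringIO.new
serve_cgi(EchoApp, env_vars: { 'PATH_INFO' => '/demo' }, input: StringIO.new, output: out)
puts out.string
```

The app itself stays unaware of CGI; swapping the adapter for a FastCGI or SCGI one would not require changing it.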

Phusion Passenger

Phusion Passenger is a special case of this. It is a Rack (and Python WSGI) server implemented as an extension module to Apache and Nginx.

Native webserver

Instead of running the language inside the webserver, you can also run the webserver inside the language. For example, WEBrick is a webserver written in Ruby that can run Rails.
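
To make the idea concrete without depending on any particular server library, here is a toy sketch of a webserver written directly in Ruby on top of the standard `socket` library. `serve_once` is a made-up helper that answers a single request; it ignores most of HTTP and is nowhere near what WEBrick or Puma actually do.

```ruby
require 'socket'

# Accepts one connection, reads the request line and headers,
# and answers with a small plain-text response.
def serve_once(server)
  client = server.accept
  request_line = client.gets # e.g. "GET /hello HTTP/1.1"
  return if request_line.nil?

  _method, path, _version = request_line.split

  # Skip the request headers: they end at the first blank line.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end

  body = "Hello from Ruby, you asked for #{path}\n"
  client.print "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n" \
               "\r\n" \
               "#{body}"
ensure
  client&.close
end
```

To try it, create a listener with `TCPServer.new('127.0.0.1', 8080)` (the port is an arbitrary choice), call `serve_once` in a loop, and point a browser or curl at it.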

1

Ruby cannot work with Apache alone. Apache is a web server, and Ruby, being an interpreted language, needs an application server to run it. Wikipedia's "Interpreted language" page has more about interpreted languages.

Ruby works with Apache in conjunction with an application server like Passenger. Read Digital Ocean's "How To Deploy a Rails App with Passenger and Apache on Ubuntu 14.04" tutorial for more information.

I have personally deployed Ruby apps with Phusion Passenger + Nginx as well.

1

As with all things, there are several ways to do it. Phusion Passenger provides Apache and NGINX modules, which are relatively comparable to e.g. mod_php and mod_perl. It's also available in standalone mode, which includes its own web server (i.e. no Apache or NGINX required). A good comparison between the three modes is available here.

Tiered web architectures will take Passenger in standalone mode and serve it from behind a reverse proxy, which can be anything from NGINX or Apache to other middlewares like HAProxy or Varnish. You might be familiar with this model because it is similar to how Jetty, Tomcat, etc. serve Java applications to the public Internet (i.e. from behind a reverse proxy). Advanced architectures may consist of multiple reverse proxies and application servers, each solving a different piece of the multi-tiered web architecture puzzle (e.g. SSL termination, load balancing, caching, compression, static asset serving, dynamic content).

I actually prefer Puma, which is analogous to Passenger standalone. It's easier to understand and (last time I checked) faster and more memory-efficient in many benchmarks. It gives you the option of a hybrid forking+threading mode that (again, last time I checked) is only available in Passenger Enterprise. Puma is the default web server in Rails and on Heroku and other hosting platforms.
