How does the Ruby interpreter integrate with HTTP servers like Apache?

Question

I have been wondering how Ruby integrates with HTTP servers like Apache. As a Java developer I understand the full stack of a servlet container and how the whole lifecycle works and how you could have a servlet or filter integrate different web frameworks into a standardized container HTTP lifecycle.

That being said I was never clear on how other frameworks like Ruby managed to hook into the apache server. Is it via a C api? Some other standard?

Phusion Passenger's Design and Architecture document has a "Common Models" section that provides a great overview of the various ways this is done.
– Jordan Running, Commented Jan 11, 2017 at 19:14

Community · Accepted Answer · 2020-06-20 09:12:55Z

There are many different ways to do that.

CGI

The oldest way is the Common Gateway Interface (CGI), introduced in 1993 in the NCSA webserver and supported by most webservers since. It is an extremely simple protocol: the webserver sets a standardized set of environment variables containing information about the request and then runs a program. That program examines the environment variables and writes its response to stdout, which the webserver simply sends back to the client. The next request starts the cycle (and the program) all over again.

Here's an extremely simple CGI example in Ruby (I don't have Apache installed, so I couldn't test this):

#!/usr/bin/env ruby

# We need to send the header as well, the webserver won't do it for us
puts 'Content-type: text/html'
puts

puts <<~"HTML"
  <!DOCTYPE html>
  <html lang="en">
    <head>
      <meta charset="utf-8">
      <title>CGI Demo Script in Ruby</title>
    </head>
    <body>
      <p>
        My name is #{ENV['SCRIPT_NAME']} and you called me with the path #{ENV['PATH_INFO']} and query string #{ENV['QUERY_STRING']}.
      </p>
    </body>
  </html>
HTML

FastCGI

Since CGI starts a new process for every single request, and communication via environment variables and stdout isn't exactly efficient either, FastCGI was created in 1996. The idea is similar to CGI, but the handler process is not started anew for every request. Instead, it stays in memory and handles requests in a loop, and the webserver and the handler communicate over an efficient binary protocol.

Communication uses either Unix sockets or TCP.
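
To give a feel for what "binary protocol" means here: in the FastCGI specification, every message is a record that starts with a fixed 8-byte header. The following is only a minimal sketch of decoding that header in plain Ruby. The helper name parse_fcgi_header and the constant FCGI_RECORD_TYPES are made up for this illustration and are not part of any library; a real handler would normally use an existing FastCGI library instead.

```ruby
# Decode the fixed 8-byte header that starts every FastCGI record.
# Layout: version (1 byte), type (1 byte), request id (2 bytes, big-endian),
# content length (2 bytes, big-endian), padding length (1 byte), reserved (1 byte).
FCGI_RECORD_TYPES = {
  1 => :begin_request,
  2 => :abort_request,
  3 => :end_request,
  4 => :params,
  5 => :stdin,
  6 => :stdout,
  7 => :stderr
}.freeze

# Hypothetical helper name, used only for this sketch.
def parse_fcgi_header(bytes)
  raise ArgumentError, 'FastCGI header must be 8 bytes' unless bytes.bytesize == 8

  version, type, request_id, content_length, padding_length, _reserved =
    bytes.unpack('CCnnCC')

  {
    version: version,
    type: FCGI_RECORD_TYPES.fetch(type, :unknown),
    request_id: request_id,
    content_length: content_length,
    padding_length: padding_length
  }
end

# A params record for request 1 with 20 bytes of content and 4 bytes of padding.
header = [1, 4, 1, 20, 4, 0].pack('CCnnCC')
p parse_fcgi_header(header)
```

The request id is what lets a single long-lived connection carry several requests at once, which is something plain CGI has no equivalent for.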

SCGI

The Simple Common Gateway Interface (SCGI), published in 2006, is similar to FastCGI, but simpler: it uses a text-based communication protocol (based on netstrings) instead of a binary one. Like FastCGI, it avoids the overhead of firing up a process for every request; the handler program keeps running and simply processes requests in a loop.
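
As a rough illustration of the netstring-based format: an SCGI request is a netstring ("<length>:<data>,") whose data is a series of NUL-terminated header names and values, followed by a body of CONTENT_LENGTH bytes. The sketch below parses one such request from any IO-like object. The helper name read_scgi_request is made up for this example, and error handling is kept to a bare minimum.

```ruby
require 'stringio'

# Read one SCGI request from an IO: a netstring of NUL-terminated headers,
# followed by a body of CONTENT_LENGTH bytes.
# Hypothetical helper name, used only for this sketch.
def read_scgi_request(io)
  # Netstring prefix: a decimal length followed by a colon.
  length = io.gets(':').to_s.chomp(':')
  raise ArgumentError, 'malformed netstring length' unless length.match?(/\A\d+\z/)

  raw_headers = io.read(length.to_i).to_s
  raise ArgumentError, 'netstring must end with a comma' unless io.read(1) == ','

  # Headers look like: name NUL value NUL name NUL value NUL ...
  fields = raw_headers.split("\0", -1)
  fields.pop # drop the empty string after the final NUL
  raise ArgumentError, 'headers must come in name/value pairs' if fields.length.odd?

  headers = fields.each_slice(2).to_h
  body = io.read(headers.fetch('CONTENT_LENGTH', '0').to_i).to_s
  { headers: headers, body: body }
end

# Build a sample request by hand and parse it again.
body = 'What is the answer to life?'
fields = ['CONTENT_LENGTH', body.bytesize.to_s, 'SCGI', '1',
          'REQUEST_METHOD', 'POST', 'REQUEST_URI', '/deepthought']
header_block = fields.map { |field| "#{field}\0" }.join
wire = "#{header_block.bytesize}:#{header_block},#{body}"

request = read_scgi_request(StringIO.new(wire))
puts request[:headers]['REQUEST_URI']
puts request[:body]
```

In a real handler the IO would be an accepted socket rather than a StringIO, and the response would be written back on the same connection.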

In-process modules

Almost every webserver has a module API that allows you to add your own code to the running webserver process. One of the earliest was Netscape's NSAPI, which appeared in the mid-90s and was a direct competitor to CGI; FastCGI, on the other hand, was developed as a reaction to it. IIS, Apache, Nginx, etc. all have similar APIs. These differ from FastCGI and SCGI mainly in that the code doesn't run as a separate process but within the webserver process. This gives it maximum performance and maximum access to the webserver internals, but also maximum potential for damage.

There are, or were, a couple of modules for running a Ruby execution engine inside a webserver: mod_ruby was an Apache module for embedding MRI inside Apache, and mod_rubinius did the same thing for Rubinius. IronRuby could run inside IIS as a plugin.

This is the solution that is classically employed by PHP (mod_php) and Perl (mod_perl). There are also mod_python, and even mod_mono and mod_java.

Webserver interfaces

Webserver interfaces provide a standardized interface between web frameworks and webservers. Actually, Servlets could be seen as a distant relative.

The most well-known interface in the Ruby world is Rack, which was modeled on Python's WSGI (Web Server Gateway Interface); Perl later got PSGI, and so on. The main idea behind such interfaces is that they provide a narrow, simple interface that decouples web frameworks from webservers. E.g. any Rack-compliant framework (Rails, Sinatra, Padrino, Cuba, Lotus, Ramaze, NYNY, Nancy, Crepe, Hobbit, Grape, Roy, …) will work with any Rack-compliant webserver (Mongrel, Thin, Unicorn, Puma, …). Before there was Rack, for example, Rails had different code for running under Lighttpd, SCGI, FastCGI, CGI, and WEBrick. Now, there's just Rack.

For webservers which are not themselves Rack-compliant (e.g. because they were created before Rack or because they are language-agnostic), there are ways to implement Rack on top of them using any of the above-mentioned techniques, e.g. there is a mod_rack for Apache and Nginx.
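
To show how narrow the Rack interface is: an application is any object that responds to call, takes the request environment hash, and returns a status, a headers hash, and a body that responds to each. Here is a minimal sketch that doesn't even need the rack gem to run. The class name HelloApp is made up for this example, and the environment hash is built by hand with only two keys, whereas a real server fills in many more.

```ruby
# A Rack application: #call takes the request environment hash and returns
# [status, headers, body], where the body responds to #each and yields strings.
class HelloApp # hypothetical example class
  def call(env)
    path = env.fetch('PATH_INFO', '/')
    body = "Hello from #{path}\n"
    headers = {
      'content-type' => 'text/plain',
      'content-length' => body.bytesize.to_s
    }
    [200, headers, [body]]
  end
end

# Exercise the app by hand, the same way a Rack-compliant server would call it.
status, headers, body = HelloApp.new.call({ 'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/demo' })
puts status
puts headers['content-type']
body.each { |chunk| print chunk }
```

To serve it for real, you would typically put `run HelloApp.new` into a config.ru file and start it with a Rack-compliant server such as Puma.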

Phusion Passenger

Phusion Passenger is a special case of this. It is a Rack (and Python WSGI) server implemented as an extension module for Apache and Nginx.

Native webserver

Instead of running the language inside the webserver, you can also run the webserver inside the language. E.g. WEBrick is a webserver written in Ruby that can run Rails.
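
To make "the webserver inside the language" concrete, here is a deliberately tiny sketch of an HTTP server written in Ruby using only the socket library. This is not how WEBrick is implemented; the helper names handle_client and serve are made up for this example, and it ignores request bodies, keep-alive, and most of HTTP.

```ruby
require 'socket'

# Read one HTTP request from the client and answer it with a plain-text response.
# Hypothetical helper name, used only for this sketch.
def handle_client(client)
  request_line = client.gets.to_s
  verb, path, = request_line.split(' ')

  # Skip the request headers, which end at the first blank line.
  loop do
    line = client.gets
    break if line.nil? || line.strip.empty?
  end

  body = "You sent #{verb} #{path}\n"
  client.write("HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n" \
               "\r\n" \
               "#{body}")
ensure
  client.close
end

# Accept and serve connections one at a time until the process is stopped.
def serve(port)
  server = TCPServer.new('127.0.0.1', port)
  loop { handle_client(server.accept) }
end

# serve(8080) # uncomment to listen on 127.0.0.1; the port number is an arbitrary choice
```

Real Ruby webservers add concurrency, proper request parsing, and a Rack adapter on top of the same basic idea.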

the Tin Man · Accepted Answer · 2017-01-11 22:02:49Z

Apache on its own cannot run Ruby code. Apache is a web server, and because Ruby is an interpreted language, something has to host the interpreter and hand requests to it; typically that is an application server. Wikipedia's "Interpreted language" page has more about interpreted languages.

Ruby works with Apache in conjunction with an application server like Passenger. Read Digital Ocean's "How To Deploy a Rails App with Passenger and Apache on Ubuntu 14.04" tutorial for more information.

I have personally deployed Ruby apps with Phusion Passenger + Nginx as well.

edited Jan 11, 2017 at 22:02 by the Tin Man

answered Jan 11, 2017 at 18:46 by Karan Shah

mwp · Accepted Answer · 2017-01-11 23:45:48Z

As with all things, there are several ways to do it. Phusion Passenger provides Apache and NGINX modules, which are relatively comparable to e.g. mod_php and mod_perl. It's also available in standalone mode, which includes its own web server (i.e. no Apache or NGINX required). A good comparison between the three modes is available here.

Tiered web architectures will take Passenger in standalone mode and serve it from behind a reverse proxy, which can be anything from NGINX or Apache to other middlewares like HAProxy or Varnish. You might be familiar with this model because it is similar to how Jetty, Tomcat, etc. serve Java applications to the public Internet (i.e. from behind a reverse proxy). Advanced architectures may consist of multiple reverse proxies and application servers, each solving a different piece of the multi-tiered web architecture puzzle (e.g. SSL termination, load balancing, caching, compression, static asset serving, dynamic content).

I actually prefer Puma, which is analogous to Passenger standalone. It's easier to understand and (last time I checked) faster and more memory-efficient in many benchmarks. It gives you the option of a hybrid forking+threading mode that (again, last time I checked) is only available in Passenger Enterprise. Puma is the default web server in Rails and on Heroku and other hosting platforms.
