Re: recently-asked FastCGI questions

Mark Brown (mbrown@OpenMarket.com)
Fri, 14 Jun 1996 09:09:25 -0400

Message-Id: <199606141309.JAA08204@breckenridge.openmarket.com>
To: fastcgi-developers@OpenMarket.com
Subject: Re: recently-asked FastCGI questions 
In-Reply-To: Paul Mahoney's message of "Fri, 14 Jun 1996 06:56:12."
             <Pine.SCO.3.90.960614064947.14598C-100000@xact4.xact.com> 
Date: Fri, 14 Jun 1996 09:09:25 -0400
From: Mark Brown <mbrown@OpenMarket.com>


Paul Mahoney asks:

    > At any instant of time the Apache server runs one request per process.
    > Therefore it has no opportunity to perform multiplexing of
    > connections.  (This will change if and when Robert Thau's
    > multi-threaded Apache core becomes a mainstream part of Apache.)
    Is this available now? If not, does Robert know when he might have
    something available?

I've only been following the message traffic; I haven't looked at the
server itself.  I suspect that Robert has converted only the minimum
set of modules outside the core.

    > The lack of connection multiplexing does not mean that FastCGI
    > applications running on the Apache server cannot benefit from
    > concurrent request handling.  It just means that instead of getting
    > concurrent requests over a single connection from the Web server, the
    > FastCGI applications must accept multiple connections, from different
    > processes of the Web server.  This is less efficient than multiplexing
    > a single connection, but more efficient than running separate
    > application processes to get concurrency.

    Could you expand on this... I don't follow what would be happening here.
    Can you set up Apache and the FCGI client to have multiple sockets between
    them? For my case I must only have the 1 FCGI client.

The current libraries only allow a single connection at a time.  In the
Apache case, that would be from one of the Apache processes to the FastCGI
application process (which is really a server, not a client, from the
FastCGI point of view).

If we had a FastCGI library organized in an event-driven style around
select (which we don't have today), then it would be a very small
matter to select on the listening socket as well as the current FastCGI
connection socket and whatever other FDs were active.  When select showed
the listening socket readable, the event loop would dispatch to a handler
that would open a new connection.  In this way a single FastCGI application
process would end up connected to multiple Apache server processes, and
the FastCGI application would use the select-based event loop to multiplex
itself between these connections.  The Apache server processes would really
be none the wiser that they were being served concurrently by a single
process, since the Apache server processes don't communicate among themselves.
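
To make the event-loop idea concrete, here is a rough sketch of one pass
of such a loop in C.  It is not code from the FastCGI library (which has
nothing like this today); MAX_CONNS, conn_handler, echo_handler and
event_loop_once are names made up for illustration, and the FastCGI
record parsing that a real handler would do is left out entirely.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_CONNS 64   /* hypothetical limit for this sketch */

/* Hypothetical handler type: called when fd is readable; returns
 * nonzero once the connection is finished and should be closed. */
typedef int (*conn_handler)(int fd);

/* Example handler: echo back whatever arrives.  The write here can
 * still block, which a real select-based program would have to avoid. */
int echo_handler(int fd)
{
    char buf[512];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n <= 0)
        return 1;                       /* EOF or error: finished */
    return write(fd, buf, (size_t)n) != n;
}

/* One pass of the event loop.  conns[] holds the open connection fds,
 * with -1 marking a free slot.  Returns the number of descriptors
 * select reported ready, or -1 if select failed. */
int event_loop_once(int listen_fd, int conns[MAX_CONNS], conn_handler handle)
{
    fd_set readable;
    int i, ready, maxfd = listen_fd;

    assert(listen_fd >= 0 && listen_fd < FD_SETSIZE);
    FD_ZERO(&readable);
    FD_SET(listen_fd, &readable);
    for (i = 0; i < MAX_CONNS; i++) {
        if (conns[i] >= 0) {
            FD_SET(conns[i], &readable);
            if (conns[i] > maxfd)
                maxfd = conns[i];
        }
    }

    ready = select(maxfd + 1, &readable, NULL, NULL, NULL);
    if (ready < 0)
        return -1;

    /* Serve the connections we already have. */
    for (i = 0; i < MAX_CONNS; i++) {
        if (conns[i] >= 0 && FD_ISSET(conns[i], &readable)) {
            if (handle(conns[i])) {
                close(conns[i]);
                conns[i] = -1;
            }
        }
    }

    /* Listening socket readable: another server process is connecting. */
    if (FD_ISSET(listen_fd, &readable)) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0) {
            for (i = 0; i < MAX_CONNS && conns[i] >= 0; i++)
                ;
            if (i < MAX_CONNS)
                conns[i] = fd;
            else
                close(fd);              /* table full: refuse */
        }
    }
    return ready;
}
```

An application would initialize every slot of conns[] to -1 and call
event_loop_once in a loop, passing its listening socket.  Each Web
server process that connects simply occupies one more slot.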

With the application organized around select, certain niceties go
out the window.  The thing folks are most likely to miss is the
output stream abstraction.  You can't call printf from a select-based
program because the underlying stream might need to do a write, and
the write might block, and then what do you do with the call stack?
If you can buffer the application's entire output in memory
then no problem.  Same thing on the input side: if you can do all
the reading up front, then give the application a pointer to the
data in memory, then you bound the event-driven complexity -- at least
for FastCGI.  But the whole reason for introducing concurrency in
the application is that you need to wait for something, like a database
lookup, so *somewhere* in the application you do need to save state
in a record and give control to the event loop.  A lot of folks
have trouble getting this right.
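
As an illustration of "save state in a record and give control to the
event loop", here is a minimal sketch of the output side in C.  It
assumes the whole response has already been buffered in memory; the
struct pending_out record and the set_nonblocking and flush_some
functions are hypothetical names, not part of any FastCGI library.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical per-connection record: the buffered response, plus how
 * much of it earlier flush attempts have already written. */
struct pending_out {
    const char *buf;
    size_t len;
    size_t sent;
};

/* Put fd into non-blocking mode.  Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Write as much buffered output as the kernel will take right now.
 * fd must be non-blocking.  Returns 1 when everything has been sent,
 * 0 when the caller should go back to the event loop and retry once
 * select reports fd writable, and -1 on error. */
int flush_some(int fd, struct pending_out *p)
{
    assert(p->sent <= p->len);
    while (p->sent < p->len) {
        ssize_t n = write(fd, p->buf + p->sent, p->len - p->sent);
        if (n > 0) {
            p->sent += (size_t)n;       /* progress is kept in the record */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return 0;                   /* would block: yield instead */
        return -1;
    }
    return 1;
}
```

Because the progress lives in the record rather than on the call
stack, nothing is lost when flush_some returns 0 and the event loop
goes off to serve another connection.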

The other way to get the same effect would be to use POSIX threads in the
FastCGI application.  In this case one would probably dedicate a thread
in the FastCGI application to accepting new connections on the listening
socket.  Once the connection was accepted, it would be initialized
using exactly the same code as FCGX_Accept currently uses for connection
establishment.  The result would be a char **envp and three FCGX_Stream *:
in, out, err.  These would either be passed to a new thread created for
the purpose of handling the connection, or would be placed on a queue
to be dealt with by a pool of worker threads.  The latter would be
good for a server-push type application in which the connection stays
open for a long time but is only used infrequently.
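
Here is a rough sketch of the queue-plus-worker-pool variant using
POSIX threads.  To stay self-contained it queues bare connection
descriptors; the step that would turn an accepted connection into envp
and the three streams is omitted.  All of the names (conn_queue,
acceptor_main, worker_main, QCAP) are invented for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

#define QCAP 16   /* hypothetical queue capacity */

/* Bounded FIFO of accepted connections, shared between threads. */
struct conn_queue {
    int fds[QCAP];
    int head, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
};

void queue_init(struct conn_queue *q)
{
    q->head = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

void queue_put(struct conn_queue *q, int fd)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QCAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    assert(q->count < QCAP);
    q->fds[(q->head + q->count) % QCAP] = fd;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

int queue_get(struct conn_queue *q)
{
    int fd;
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    fd = q->fds[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return fd;
}

struct acceptor_args { int listen_fd; struct conn_queue *q; };
struct worker_args { struct conn_queue *q; void (*handle)(int fd); };

/* Thread dedicated to the listening socket: accept and enqueue. */
void *acceptor_main(void *arg)
{
    struct acceptor_args *a = arg;
    for (;;) {
        int fd = accept(a->listen_fd, NULL, NULL);
        if (fd < 0)
            break;                      /* sketch: stop on any error */
        queue_put(a->q, fd);
    }
    return NULL;
}

/* Pool worker: take the next connection, serve it, close it.
 * A queued value of -1 is a hypothetical "shut down" sentinel. */
void *worker_main(void *arg)
{
    struct worker_args *w = arg;
    for (;;) {
        int fd = queue_get(w->q);
        if (fd < 0)
            break;
        w->handle(fd);
        close(fd);
    }
    return NULL;
}
```

The application would call queue_init once, start one thread running
acceptor_main and several running worker_main with pthread_create, and
supply its own handle function.  The thread-per-connection variant
would simply call pthread_create from the acceptor instead of
queue_put.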

There's no reason in principle that somebody can't write a POSIX threads
emulator on top of select.  (For all I know, Robert's threads package
may look like this.)  That would be a nicer way to get the portability
in my view.

In summary:

    The FastCGI protocol/specification allows applications to accept
    concurrent connections from the Web server.

    The servers are all capable of making concurrent connections today.

    The application libraries are *not* capable of accepting
    concurrent connections today.

    There are two possible ways of providing the concurrency at the
    FastCGI application level:

        Using select is more portable but requires deeper surgery to the
        existing application library and provides a conceptually more
        challenging environment for writing application programs.

        Using threads is less portable but requires only superficial
        changes to the current application library and provides
        a more familiar set of application programming abstractions
        (e.g. input and output streams).

    --mark