Re: Linux, Apache 1.2b4 latest fastCGI unix socket problem

Stanley Gambarin (gambarin@OpenMarket.com)
Wed, 15 Jan 1997 10:28:51 -0500

Message-Id: <199701151528.KAA06624@u4-138.openmarket.com>
To: anthonyr@ce.com.au (Anthony Rumble)
Subject: Re: Linux, Apache 1.2b4 latest fastCGI unix socket problem 
In-Reply-To: Your message of "Tue, 14 Jan 1997 11:20:16 +1100."
             <m0vkL1x-0002VzC@enterprise.ce.com.au> 
Date: Wed, 15 Jan 1997 10:28:51 -0500
From: Stanley Gambarin <gambarin@OpenMarket.com>

> 
> I have a problem with the FastCGI routines in Apache...
> 
> Running on Linux 2.0.26
> 
> Using Apache 1.2b4 (also happened on 1.2b1,2,3)
> I've tried the mod_fastcgi.c that comes with Apache,
> and also the latest 1.4.2 one from http://www.fastcgi.com/servers/apache/1.4.2/apache-fastcgi.tar.Z
> 
> Im Running Perl 5.003_16 using sfio and FCGI module.
> 
> The server starts, and the FastCGI apps run up ok and run fine..
> 
> After a period of heavy usage, the clients one by one stop
> responding.. doing a ps -lxaw the FCGI apps are stuck in the
> 
> "unix_data_wait". They are normally at the "unix_accept" state..
> Once in data wait, they never respond again. I've tried 
> killing them off , the apache server restarts them, but they are still
> stuck in the unix_data_wait.. Only a server reload will reset them
> back to a working state again.
> 
> Does anyone have any ideas? This is very frustrating.. as I had to restart
> the server every few hours to fix it..
> 
> Oh.. I get this error in the error_log
> 
> [Tue Jan 14 10:45:12 1997] access to /home/httpd/html/matilda/jumpto.fcg failed for 202.0.90.217, reason: mod_fastcgi: Could not connect to application, OS error 'Interrupted system call'
> [Tue Jan 14 10:45:15 1997] read script input or send script output timed out for 203.1.75.154
> [Tue Jan 14 10:45:15 1997] access to /home/httpd/html/matilda/jumpto.fcg failed for 203.1.75.154, reason: mod_fastcgi: Could not connect to application, OS error 'Interrupted system call'
> [Tue Jan 14 10:45:15 1997] read: Broken pipe
> [Tue Jan 14 10:45:15 1997] - lingering_close
> 
> Could this be the FastCGI application?
> 
> If so.. why doesn't killing it off fix it?
> 
> -- 
> Anthony Rumble - Online Ordering Systems
> Corporate Express Australia Limited
> Phone 02-9335-0669 Fax 02-9335-0753 Mobile 015-955-042 Pager 016-634-997
> 

	You are right in assuming that this is not a FastCGI problem.
As a matter of fact, the problem lies in the Apache web server and
first appeared after release 1.0 of Apache.  It stems from Apache's
attempt to work around some buggy TCP stack implementations.  The last
entry in your error log indicates a call to lingering_close(), which is
where the problem occurs.  As far as I understand it, the used sockets
are not closed properly and are left in the FIN_WAIT2 state, which
means that you eventually run out of sockets.  The Apache Group
suggests enabling the NO_LINGCLOSE option and recompiling your server.
This alleviates the problem on *** some, but NOT all *** operating
systems running Apache.  If the problem still persists, I would suggest
contacting the Apache Group and submitting a bug report with the above
information.
							Stanley.
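
One way to check whether sockets really are piling up in FIN_WAIT2 is to
count connection states in the output of `netstat -an`.  The following is
only a rough sketch: the function names and the sample text are made up for
illustration, and it assumes the state is the last column of each TCP row
(true for plain `netstat -an`, not for variants that append extra columns).

```python
from collections import Counter

# Both spellings occur in practice: Linux netstat prints FIN_WAIT2,
# BSD-derived systems print FIN_WAIT_2.
FIN_WAIT2_NAMES = {"FIN_WAIT2", "FIN_WAIT_2"}


def count_tcp_states(netstat_output):
    """Count TCP connection states in text produced by `netstat -an`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows start with "tcp" (or "tcp4"/"tcp6"); this sketch assumes
        # the connection state is the last column of the row.
        if len(fields) >= 6 and fields[0].lower().startswith("tcp"):
            counts[fields[-1]] += 1
    return counts


def fin_wait2_total(counts):
    """Sum the rows reported under either FIN_WAIT2 spelling."""
    return sum(n for state, n in counts.items() if state in FIN_WAIT2_NAMES)


# Hypothetical sample, not taken from the server discussed above.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:80           203.0.113.5:1042        FIN_WAIT2
tcp        0      0 192.0.2.10:80           203.0.113.9:1187        FIN_WAIT2
tcp        0      0 192.0.2.10:80           203.0.113.7:1201        ESTABLISHED
"""

if __name__ == "__main__":
    print(fin_wait2_total(count_tcp_states(SAMPLE)))
```

Running this against real `netstat -an` output every few minutes under load
should show whether the FIN_WAIT2 count keeps growing before the FastCGI
applications stop responding.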