Re: OS Error.... again and again..

Fabian Thylmann (thylmann@m1.sprynet.com)
Tue, 23 Sep 1997 17:29:23 +0200

Message-Id: <3.0.3.32.19970923172923.00952a70@m1.sprynet.com>
Date: Tue, 23 Sep 1997 17:29:23 +0200
To: Stanley Gambarin <stanleyg@cs.bu.edu>
From: Fabian Thylmann <thylmann@m1.sprynet.com>
Subject: Re: OS Error.... again and again..
In-Reply-To: <Pine.GSO.3.96.970922215125.26078B-100000@csa>

>	- try your setup on another OS

Emm, that would be kinda hard, since I don't have any other OS available
on which I could try all this with the same setup of mysql, the database
and all...

>	- provide AppClass lines from config file

I changed them around a bit but it doesn't really help.. the current one is:
AppClass /usr/local/etc/httpd/cgi-bin/trakker --processes 10 --listen-queue-depth 250

>	- provide sample fcgi script that can be used to 
>	reproduce the problem.

This is a C program, not a script. If you tell me where to send this to, I
might be able to send you the source.

>	- try different combinations of the developer kits/
>	apache module/OS.

I can't just switch OSes... kinda hard to do that..
But I tried various combinations of apache mod and dev kits...
Currently, I am running the newest non-beta mod_fastcgi with apache 1.2.4
and the devkit I'm using is 2.0b2.1.
I tried it with the devkit 1.5.2 but it didn't change anything.

>	- hook the server through the debugger and put the
>	breakpoint in the function where error message gets
>	printed.  Then run the server until you get an error,
>	and do a stack trace and send it to the list.

If you could explain to me how exactly I do this, I could try it.

>	Having a lot of TIME_WAIT open sockets maybe related to
>the web server, as most of the FastCGI processing/communication
>is done through Unix domain sockets (unless you are using -port)

Yes, that could be: because of all these OS errors, the webserver has
to send out loads of 500 Server Errors.

>option for the AppClass.  Also, make sure to read the following:
>http://www.apache.org/docs/misc/fin_wait_2.html

I will check this as soon as possible; right now I can't get to apache.org.


Also, at the moment it looks like the accept() bug is happening, ALTHOUGH I
am using the devkits from fastcgi.idle.com which were said to have been
fixed.

Here is why I think it is an accept() problem.

The app uses a MySQL database. With this command: mysqladmin proc
I can check the processes currently logged in to the MySQL database. All 10
fastcgi processes are there all the time. It also tells me the current
query each process is running.

When I start the server, after a few seconds most of the fcgi processes
are in use (I have about 3-5 hits per second to this fcgi app). Now,
after a minute or so, some start to die and won't do anything anymore;
mysqladmin proc tells me the process is sleeping. This gets worse and worse
until not a single process is doing anything and every hit to this
fastcgi app results in a 500 Server Error.
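
To put a number on how many of the 10 connections have gone idle, something
like the following could be run over the mysqladmin proc output. This is only
a rough sketch in C, not code from the app or the devkit. It assumes the
usual table layout (| Id | User | Host | db | Command | ...), so that Command
is the fifth column and idle connections show "Sleep" there; the name
count_sleeping is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Count rows of `mysqladmin proc` output whose Command column is "Sleep".
 * Assumption: rows look like | Id | User | Host | db | Command | ... |
 * so Command is the 5th pipe-delimited field. Border rows ("+---+")
 * and the header row are not counted. */
int count_sleeping(const char *output)
{
    int sleeping = 0;
    const char *line = output;

    while (line != NULL && *line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        if (len > 0 && line[0] == '|') {
            const char *p = line + 1;      /* just past the leading '|' */
            const char *stop = line + len; /* one past the end of the row */
            int field = 1;

            /* Advance p to the first character of the 5th field. */
            while (p < stop && field < 5) {
                if (*p == '|')
                    field++;
                p++;
            }

            if (field == 5) {
                const char *q;

                while (p < stop && *p == ' ')   /* skip leading padding */
                    p++;
                q = p;
                while (q < stop && *q != '|')   /* find end of the field */
                    q++;
                while (q > p && q[-1] == ' ')   /* trim trailing padding */
                    q--;

                if ((size_t)(q - p) == 5 && strncmp(p, "Sleep", 5) == 0)
                    sleeping++;
            }
        }
        line = end ? end + 1 : NULL;
    }
    return sleeping;
}
```

The text could come from reading the command's output through popen(), for
example once every few seconds, so the point where the count starts climbing
can be matched against the error log. If the column layout differs on another
MySQL version, the field number would need adjusting.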

- Fabian Thylmann