[vox-tech] Crontab oddity - server timeout?
Brian Lavender
brian at brie.com
Thu Mar 26 20:10:56 PDT 2009
On Mon, Mar 23, 2009 at 03:22:06PM -0700, Bill Kendrick wrote:
>
> So I'm using lighttpd and fast_cgi, which occasionally has a problem where
> it gets 'stuck'. (Unable to bring fast_cgi back to life, even though
> resources are once again available.) Usually this results in Error 500s
> that never go away until lighttpd is restarted.
>
> So to avoid having to manually go in and resurrect the server, I created
> a shell script that tries to hit the site, checks for an HTTP 200 response,
> and if it doesn't see that, it does a 'tail' of the access and error logs
> (so that I can see what was happening at the time), and then invokes an
> "/etc/init.d/lighttpd restart" to kick the server.
>
> I've got the following crontab entry:
>
> */2 * * * * root THE_SCRIPT
>
> meaning it should run once every 2 minutes, all the time. I only get an
> email when it produces output, and it only does that if it fails to
> contact the webserver.
>
> However, when it does fail, I get numerous reports at once. Could this
> be because the server isn't responding immediately when I check the status?
>
> I'm doing that via, in the shell script:
>
> STATUS=`wget --save-headers http://www.MYSITE.com/ -O - 2> /dev/null | head -1 | cut -d " " -f 2`
>
> In other words, hit the site, save the headers, save them out to stdout,
> chop off the "HTTP/1.1" to get the delicious "200" (hopefully) status.
>
>
> I guess maybe I need to give it a "--timeout" argument, and something
> less than 120 seconds, so that the jobs don't run over each other...?
If the server is running and accepts the connection, but never sends back
a response, then I would imagine wget will just sit there waiting. Is it
accepting a socket connection, but not reporting back? What if you put a
lock file in your script, so that it exits if another copy is already running?
See "20.9.1 Locking a mailbox file":
http://rute.2038bug.com/node23.html.gz#SECTION002390000000000000000
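A minimal sketch of that lock idea, assuming a lock directory instead of a lock file, since mkdir either creates the directory or fails in one atomic step. The path /tmp/check_lighttpd.lock and the name run_exclusive are made up for illustration; the command to guard is passed as arguments.

```shell
#!/bin/sh
# Hypothetical lock location; override by setting LOCKDIR in the environment.
LOCKDIR="${LOCKDIR:-/tmp/check_lighttpd.lock}"

# Run the given command only if no other instance holds the lock.
run_exclusive() {
    # mkdir is atomic: it creates the directory or fails, never both.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        # Lock is held (or cannot be created); skip this run quietly
        # so that cron has nothing to mail.
        return 0
    fi
    status=0
    "$@" || status=$?
    rmdir "$LOCKDIR"
    return "$status"
}

# Only act when a command was supplied on the command line.
if [ "$#" -gt 0 ]; then
    run_exclusive "$@"
fi
```

The crontab entry would then call this wrapper with THE_SCRIPT as its argument. One caveat: if a run is killed before it reaches rmdir, the stale directory blocks every later run until someone removes it, so a real script would probably want a trap or an age check on the lock.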
Have you thought about using Nagios? It's tricky to configure,
but there is a Nagios book available through
http://safari.oreilly.com. I believe Nagios has a place where you
can configure it to take action if a service is down.
Nagios, 2nd Edition
by Wolfgang Barth
Publisher: No Starch Press
Pub Date: October 28, 2008
Print ISBN-13: 978-1-593-27179-4
Pages: 720
There is also the Linux Networking Cookbook. It has some fast, easy
methods for monitoring your httpd service.
Linux Networking Cookbook
by Carla Schroder
Publisher: O'Reilly Media, Inc.
Pub Date: November 26, 2007
Print ISBN-10: 0-596-10248-8
Print ISBN-13: 978-0-596-10248-7
Pages: 456
It has a Nagios section, and it is also available through the Safari site. I
imagine you might have some other sources as well. ;-)
Or you could write your own socket client using select. Create your socket
file descriptor and pass it to the following function.
http://www.gnu.org/software/hello/manual/libc/Waiting-for-I_002fO.html
#define _GNU_SOURCE   /* for TEMP_FAILURE_RETRY in <unistd.h> (glibc) */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>

/* Wait up to `seconds` for filedes to become readable. */
int
input_timeout (int filedes, unsigned int seconds)
{
  fd_set set;
  struct timeval timeout;

  /* Initialize the file descriptor set. */
  FD_ZERO (&set);
  FD_SET (filedes, &set);

  /* Initialize the timeout data structure. */
  timeout.tv_sec = seconds;
  timeout.tv_usec = 0;

  /* select returns 0 if timeout, 1 if input available, -1 if error. */
  return TEMP_FAILURE_RETRY (select (FD_SETSIZE,
                                     &set, NULL, NULL,
                                     &timeout));
}

int
main (void)
{
  /* Demo: wait up to 5 seconds for input on stdin. */
  fprintf (stderr, "select returned %d.\n",
           input_timeout (STDIN_FILENO, 5));
  return 0;
}
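For the cron script itself, the same kind of bounded wait can probably be had without any C, assuming GNU wget: its --timeout option caps the DNS, connect, and read waits, and --tries=1 stops it from retrying. The 30-second value and the function names below are only illustrative; pick something comfortably under the 120-second cron interval.

```shell
#!/bin/sh
# Pull the status code out of the first line of an HTTP response on stdin,
# e.g. "HTTP/1.1 200 OK" -> "200".
parse_status() {
    head -n 1 | cut -d ' ' -f 2
}

# Fetch a URL with bounded waits and print its status code.
# Prints nothing when wget produces no output, for example on a timeout.
fetch_status() {
    wget --timeout=30 --tries=1 --save-headers -O - "$1" 2>/dev/null | parse_status
}
```

In the script that would be something like STATUS=$(fetch_status http://www.MYSITE.com/), treating anything other than 200, including an empty string, as a failure. Note that --timeout limits each individual wait, not the total run time, so it narrows the window for overlapping jobs but does not replace the lock.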
brian
--
Brian Lavender
http://www.brie.com/brian/