[vox-tech] C question: Determining where a signal was raised

Peter Jay Salzman p at dirac.org
Fri Oct 22 22:44:44 PDT 2004


My apologies to clc readers; I'm aware this really isn't a C question.  ;-)



I have some code that traps floating point errors.  The signal is trapped
with this code:


      struct sigaction action;

      memset(&action, 0, sizeof(action));
      action.sa_sigaction = fpe_callback;    /* which callback function  */
      sigemptyset(&action.sa_mask);          /* other signals to block   */
      action.sa_flags = SA_SIGINFO;          /* give details to callback */

      if (sigaction(SIGFPE, &action, 0))
         die("Failed to register signal handler.");


and, when the signal is raised, the callback function is:


   void fpe_callback(int sig_number, siginfo_t *info, void *data)
   {
      (void) data;      /* unused here; points to a ucontext_t */

      if (sig_number != SIGFPE) {
         fprintf(stderr, "%s:%d %s error: "
            "received wrong signal number %d not %d\n",
            __FILE__, __LINE__, __FUNCTION__, sig_number, SIGFPE);
         exit(2);
      }

      fprintf(stderr, "%s:%d %s warn: ", __FILE__, __LINE__, __FUNCTION__);
      fpe_print_cause(stderr, info);

      exit(1);
   }


The function fpe_print_cause() does nothing more than print the cause of the
floating point error:


   void fpe_print_cause(FILE *file, siginfo_t *info)
   {
      if (info->si_signo != SIGFPE) {      // should never happen
         die("Somehow got a wrong signo = %d\n", info->si_signo);
      } else {
         fprintf(file,
            "FPE reason %d = \"%s\", from address %p\n",
            info->si_code,
            info->si_code == FPE_INTDIV ? "integer divide by zero"     :
            info->si_code == FPE_INTOVF ? "integer overflow"           :
            info->si_code == FPE_FLTDIV ? "FP divide by zero"          :
            info->si_code == FPE_FLTOVF ? "FP overflow"                :
            info->si_code == FPE_FLTUND ? "FP underflow"               :
            info->si_code == FPE_FLTRES ? "FP inexact result"          :
            info->si_code == FPE_FLTINV ? "FP invalid operation"       :
            info->si_code == FPE_FLTSUB ? "subscript out of range"     :
            "unknown",
            (void *) info->si_addr
         );
      }
   }
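
The chain of conditionals in fpe_print_cause() could also be written as a
small lookup function, which keeps the fprintf() call short and lets the
mapping be checked on its own.  A minimal sketch, assuming a hypothetical
helper name fpe_code_name() that is not part of the original code:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Hypothetical helper (not in the original post): map the si_code of a
   SIGFPE to the same descriptions used by fpe_print_cause(). */
const char *fpe_code_name(int code)
{
   switch (code) {
   case FPE_INTDIV: return "integer divide by zero";
   case FPE_INTOVF: return "integer overflow";
   case FPE_FLTDIV: return "FP divide by zero";
   case FPE_FLTOVF: return "FP overflow";
   case FPE_FLTUND: return "FP underflow";
   case FPE_FLTRES: return "FP inexact result";
   case FPE_FLTINV: return "FP invalid operation";
   case FPE_FLTSUB: return "subscript out of range";
   default:         return "unknown";
   }
}
```

With this in place the "%s" argument would simply be
fpe_code_name(info->si_code).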



The *intent* of fpe_callback() is to print the function and line number that
were executing when the FPE signal was raised.  However, the function and line
number that get printed are those of fpe_callback() itself, since __FILE__,
__LINE__ and __FUNCTION__ expand where they are written.  Useless information.
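
The reason the handler's own location shows up can be seen in isolation:
__LINE__ is replaced by the preprocessor with the line on which it is
written, not the line that was running when the function got called.  A
small sketch, where report_line() and CALLER_LINE() are made-up names used
only for illustration:

```c
/* Hypothetical function: __LINE__ is fixed at compile time to the line of
   the return statement below, no matter who calls report_line(). */
int report_line(void)
{
   return __LINE__;
}

/* Hypothetical macro: a macro is expanded at each place it is used, so
   here __LINE__ becomes the line of the code that wrote CALLER_LINE(). */
#define CALLER_LINE() (__LINE__)
```

Calling report_line() from two different places gives the same number both
times, which is exactly what happens with the fprintf() in fpe_callback().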

Is there a way to grab the function, file, and line number of the code that
was executing when the FPE signal was raised?
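
One possible direction, assuming glibc (the functions below are GNU
extensions declared in <execinfo.h>, not standard C or POSIX), is to dump
the call stack from inside the handler.  This is only a sketch:
dump_backtrace() and MAX_FRAMES are hypothetical names, and whether the
frames of the interrupted code show up depends on the platform being able
to unwind through the signal frame.

```c
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd(): glibc only */
#include <unistd.h>     /* STDERR_FILENO */

#define MAX_FRAMES 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: write the current call stack to the file
   descriptor fd, one frame per line, and return how many frames were
   captured. */
int dump_backtrace(int fd)
{
   void *frames[MAX_FRAMES];
   int count = backtrace(frames, MAX_FRAMES);

   /* Writes straight to fd instead of returning malloc()ed strings. */
   backtrace_symbols_fd(frames, count, fd);
   return count;
}
```

The idea would be to call dump_backtrace(STDERR_FILENO) in fpe_callback()
before exit(1).  Function names only appear if the program is linked with
-rdynamic; otherwise the raw addresses (and the si_addr value already being
printed) can be handed to addr2line -e with the executable name to get a
file and line, provided the build has debug information.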

Running the executable under GDB is not an option because sometimes it can
take many, many hours for the FPE to be raised.  Also, I thought I was being
crafty by replacing:

   exit(1);

with:

   abort();

A core file is generated, which should've given me details of where the code
was when the FPE was generated, but it looks like the stack blew chunks:

   p@lucifer$ gdb avataralt core 
   Using host libthread_db library "/lib/tls/libthread_db.so.1".
   Core was generated by `./avataralt'.
   Program terminated with signal 6, Aborted.

   warning: current_sos: Can't read pathname for load map: Input/output error

   Reading symbols from /lib/tls/libm.so.6...done.
   Loaded symbols for /lib/tls/libm.so.6
   Reading symbols from /lib/tls/libc.so.6...done.
   Loaded symbols for /lib/tls/libc.so.6
   Reading symbols from /lib/ld-linux.so.2...done.
   Loaded symbols for /lib/ld-linux.so.2
   #0  0x4006cee9 in raise () from /lib/tls/libc.so.6
   (gdb) bt
   #0  0x4006cee9 in raise () from /lib/tls/libc.so.6
   #1  0x4017aedc in ?? () from /lib/tls/libc.so.6
   #2  0x00003ffe in ?? ()
   #3  0x4006e781 in abort () from /lib/tls/libc.so.6
   #4  0x00000000 in ?? ()
      (snip)
   #46 0x40016c40 in ?? () from /lib/ld-linux.so.2
   #47 0x000000a3 in ?? ()
   #48 0x40016e78 in _r_debug ()
   #49 0xbfff8b74 in ?? ()
   #50 0x4000ba16 in _dl_map_object_deps () from /lib/ld-linux.so.2
   Previous frame inner to this frame (corrupt stack?)

To be honest, I don't have the slightest clue what happens to the stack
when an asynchronous signal handler executes or when a long jump happens.
I assume this is why GDB thinks the stack was corrupt...
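
For what it's worth, on Linux the handler runs on the interrupted thread's
own stack (unless an alternate stack was set up with sigaltstack), on top
of a frame holding the saved registers, and the third handler argument
points at that saved context as a ucontext_t.  A minimal sketch that only
shows a handler being entered and control coming back afterwards; it uses
SIGUSR1 rather than SIGFPE, and probe_handler() and probe_signal_return()
are hypothetical names:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t seen_signo = 0;
static volatile sig_atomic_t had_context = 0;

/* Hypothetical handler: only records what it was handed. */
static void probe_handler(int sig, siginfo_t *info, void *context)
{
   (void) sig;
   seen_signo = info->si_signo;
   had_context = (context != NULL);   /* really a ucontext_t * */
}

/* Hypothetical driver: send SIGUSR1 to ourselves.  Returns 1 if the
   handler ran with a context pointer and control came back here. */
int probe_signal_return(void)
{
   struct sigaction action;

   memset(&action, 0, sizeof(action));
   action.sa_sigaction = probe_handler;
   sigemptyset(&action.sa_mask);
   action.sa_flags = SA_SIGINFO;

   if (sigaction(SIGUSR1, &action, NULL) != 0)
      return 0;

   raise(SIGUSR1);   /* the handler runs before raise() returns */

   return seen_signo == SIGUSR1 && had_context;
}
```

Returning like this is fine for SIGUSR1.  For a SIGFPE caused by a real
arithmetic fault, POSIX leaves the behavior undefined if the handler
returns normally, which is one reason to exit() or abort() from
fpe_callback() instead.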


Trace code will work, but I'm looking for something more elegant than
sprinkling trace code all over the place.  I'm so busy, the last thing I want
to do is start putting junk in my code that needs to be taken out.  If I'm
going to spend time on this, I at least want a return on my investment and
learn something I didn't know when I woke up this morning...   ;-)

Pete

