Changeset 15194

Feb 5, 2012, 3:04:55 PM

Try to address some of the problems exposed by some JVMs' "signal
chaining" implementation (where the JVM installs signal handlers
that override the application's and calls the application's handlers
when its own can't deal with the exception; those calls don't observe
things like SA_ONSTACK or SA_RESTART, and this nonsense gets in the way
of some of CCL's stack-switching nonsense).

We might instead say:

  • when we install our signal handlers, remember any "foreign" handler that might have been installed previously.
  • if a thread gets an exception signal while in foreign code and the signal had a foreign handler, call that foreign handler rather than dropping into the kernel debugger. (Yes, this'll have some of the same problems in reverse, but we might be able to address some of those problems to at least some degree.)
  • after a JVM is initialized, reestablish CCL's signal handlers.

I don't think that signal chaining can really work in general, so
this is mostly a question of whether CCL or the JVM has to live with
its limitations. (There might also be some performance issues.)

What's here is x86-specific. Android's Dalvik VM seems to only try
to override SIGBUS and that's less intrusive than the x86 situation;
I don't know what JVMs are available in full-blown ARM Linux or whether
any of them try to do signal chaining.

(We can avoid these issues completely for asynchronous signals used
to suspend/interrupt/kill threads if we're simply able to avoid using
conflicting signal numbers for that functionality. That isn't possible
for synchronous hardware exceptions.)

OpenJDK VMs may have options (such as -Xrs) that control whether the
JVM or "native code" handles things like SIGINT/SIGQUIT.

This has been tested (some) on x86 Linux. I'll try to watch the
automatic build system to see if I broke anything on other platforms.

1 edited


  • trunk/source/lisp-kernel/x86-exceptions.c

    r15164 r15194  
+  We do all kinds of funky things to avoid handling a signal on the lisp
+  stack.  One of those funky things involves using the __builtin_return_address()
+  intrinsic so that the real handler returns to the right place when it exits,
+  even if it returns on a different stack.  Code at "the right place" is presumed
+  to just do a sigreturn, however the OS does that.
+  Sadly, some JVMs (at least) do what they call "signal chaining": they install
+  their own handlers for signals that we expect to handle, and call our handler
+  when they realize that they don't know how to handle what we raise.  They
+  don't observe sigaltstack, and they don't necessarily call our handler
+  tail-recursively (so our stack-switching code would cause our handler to
+  return to the JVM's, running on the wrong stack).
+  Try to work around this by setting up an "early" signal handler (before any
+  of this JVM nonsense can take effect) and noting the address it'd return to.
+real_sigreturn = (pc)0;
+#define SIGRETURN_ADDRESS() (real_sigreturn ? real_sigreturn : __builtin_return_address(0))
+#ifndef WINDOWS
+#ifndef DARWIN
+early_intn_handler(int signum, siginfo_t *info, ExceptionInformation *xp)
+  real_sigreturn = (pc) __builtin_return_address(0);
+  xpPC(xp) += 2;
+  __asm volatile("int $0xcd");
+#ifndef WINDOWS
+#ifndef DARWIN
+  struct sigaction action, oaction;
+  action.sa_sigaction = (void *) early_intn_handler;
+  sigfillset(&action.sa_mask);
+  action.sa_flags = SA_SIGINFO;
+  sigaction(SIGNUM_FOR_INTN_TRAP,&action,&oaction);
+  do_intn();
+  sigaction(SIGNUM_FOR_INTN_TRAP,&oaction, NULL);
   TCR* tcr = get_tcr(true);
+  Boolean do_stack_switch = false;
+  stack_t ss;
 #if WORD_SIZE==64
-  handle_signal_on_foreign_stack(tcr,signal_handler,signum,info,context,(LispObj)__builtin_return_address(0)
-);
+  /* Because of signal chaining - and the possibility that libraries
+     that use it ignore sigaltstack-related issues - we have to check
+     to see if we're actually on the altstack.
+     When OpenJDK VMs overwrite preinstalled signal handlers (that're
+     there for a reason ...), they're also casual about SA_RESTART.
+     We care about SA_RESTART (mostly) in the PROCESS-INTERRUPT case,
+     and whether a JVM steals the signal used for PROCESS-INTERRUPT
+     is platform-dependent.  On those platforms where the same signal
+     is used, we should strongly consider trying to use another one.
+  */
+  sigaltstack(NULL, &ss);
+  if (ss.ss_flags == SS_ONSTACK) {
+    do_stack_switch = true;
+  } else {
+    area *vs = tcr->vs_area;
+    BytePtr current_sp = (BytePtr) current_stack_pointer();
+    if ((current_sp >= vs->low) &&
+        (current_sp < vs->high)) {
+      do_stack_switch = true;
+    }
+  }
+  if (do_stack_switch) {
+    handle_signal_on_foreign_stack(tcr,signal_handler,signum,info,context,(LispObj)SIGRETURN_ADDRESS());
+  } else {
+    signal_handler(signum,info,context);
+  }
 #else /* altstack works */
+   There aren't likely any JVM-related signal-chaining issues here, since
+   on platforms where that could be an issue we use either an RT signal
+   or an unused synchronous hardware signal to raise interrupts.
 altstack_interrupt_handler (int signum, siginfo_t *info, ExceptionInformation *context)
 #else /* altstack works */
 altstack_suspend_resume_handler(int signum, siginfo_t *info, ExceptionInformation  *context)
+  x86_early_exception_init();
   install_pmcl_exception_handlers();