Ticket #715 (closed defect: fixed)
Foreign exception issues
|Reported by:||gb||Owned by:|
|Component:||Runtime (threads, GC)||Version:||trunk|
Historically, CCL has treated an exception that occurs in foreign code as being fatal; we don't in general know what foreign state may need to be unwound or whether the code that got the exception is reentrant, so the absolute best that we could do is a sort of "cross your fingers, pray, and signal a lisp error." Whether that's worth a try or not is a separate issue.
Relatively recent changes to the trunk allow us to note when SIGFPE is raised during execution of foreign code (at least on x8664); this is a good thing, in that it removes a little bit of overhead from every ff-call. This change exposes a subtle and long-standing bug.
When a thread gets an exception on Unix platforms, it stores the exception context in a TCR field, unmasks blocked signals, and waits for the exception lock. That makes sense if the exception occurred during the execution of lisp code: if some other thread GCs while the thread in question is waiting, the GC thread will see that thread's pending exception context. If the exception occurs in foreign code, the GC thread should not see the pending exception context. (As I said, this is a long-standing bug; the SIGFPE handling just makes it theoretically more likely to occur.)
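As a rough illustration of the intended behaviour (not the actual kernel code), the check could look something like the sketch below. All names here (`sketch_tcr`, `sketch_note_exception`, the `VALENCE_*` constants) are hypothetical stand-ins, and the real TCR keeps more state than a single pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real CCL kernel definitions. */
typedef enum { VALENCE_LISP, VALENCE_FOREIGN } valence_t;

typedef struct {
    valence_t valence;            /* what the thread was running when the exception hit */
    void *pending_exception_ctx;  /* context that a GC thread may inspect */
} sketch_tcr;

/* Publish the exception context for a concurrent GC only if the exception
   interrupted lisp code. For foreign code the field is left untouched.
   Returns 1 if the context was published, 0 otherwise. */
int sketch_note_exception(sketch_tcr *tcr, void *ctx)
{
    assert(tcr != NULL);
    if (tcr->valence == VALENCE_LISP) {
        tcr->pending_exception_ctx = ctx;
        return 1;
    }
    return 0;
}

/* What a GC thread would consult while the faulting thread is waiting
   for the exception lock. */
void *sketch_gc_visible_context(const sketch_tcr *tcr)
{
    assert(tcr != NULL);
    return tcr->pending_exception_ctx;
}
```

The point of the sketch is only that the decision to publish the context depends on the thread's valence at the time of the exception; how the foreign-code context is kept for the handler's own use is left out.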
On Win64, a thread can be suspended or interrupted while in the process of returning from an exception and restoring its valence. We've assumed that a thread can only return from an exception if the exception occurred during execution of lisp code, so when pc-lusering our way out of exception return on Win64 we've assumed that we'll be resuming in lisp state; the SIGFPE handling in foreign code means that this assumption is no longer valid, and we'll need to handle this more carefully.
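One way to picture "more carefully" is to remember the valence that was in effect when the exception was taken and restore that, rather than assuming lisp. The sketch below is an assumption about the shape of such a fix, not what the runtime does; `sketch_thread_state` and both functions are hypothetical names, and the real exception-return path involves register and stack state that is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real CCL kernel definitions. */
typedef enum { RESUME_LISP, RESUME_FOREIGN, RESUME_IN_HANDLER } resume_valence_t;

typedef struct {
    resume_valence_t current;             /* valence right now */
    resume_valence_t saved_at_exception;  /* valence when the exception was taken */
} sketch_thread_state;

/* On exception entry: remember what the thread was doing. */
void sketch_enter_exception(sketch_thread_state *t)
{
    assert(t != NULL);
    t->saved_at_exception = t->current;
    t->current = RESUME_IN_HANDLER;
}

/* Completing an exception return, including on behalf of a thread that was
   suspended partway through it: restore the saved valence instead of
   hard-coding RESUME_LISP. */
void sketch_finish_exception_return(sketch_thread_state *t)
{
    assert(t != NULL);
    t->current = t->saved_at_exception;
}
```

With the old assumption, the second function would set `current` to the lisp valence unconditionally, which is wrong for a thread that took SIGFPE in foreign code.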
The likelihood of bad things happening is small (but non-zero).