Some notes on Darwin cooperative threads
It's obviously possible (see !OpenMCL < 0.14) to implement user-space cooperative threads without using OS facilities, and some OSes provide their own cooperative threading facilities (e.g., "fibers" on Windows). This page discusses issues that are specific to the traditional MacOS cooperative threads provided in OSX, and issues related to using native and cooperative threads in the same process.
Cooperative threads on OSX are pthreads that have certain data associated with them and which ordinarily execute specific startup and shutdown functions. In many respects, they seem to behave just like native threads: they can receive asynchronous signals and run their handlers for those signals even if they're not "runnable". (In other words, the illusion of cooperative scheduling is maintained by the threads library, not by the OS; they can run concurrently with preemptively scheduled threads, though it's usually the case that only one cooperative thread can be in a non-blocked state at any time.)
Private, per-thread data structure
The traditional MacOS threads library associates an internal data structure with each OS-level thread that it's aware of. (It's "aware of" threads created by NewThread() and is aware of some distinguished native thread that first calls a function in the Carbon threads library; I don't know exactly which functions are involved. It's not generally aware of threads created via pthread_create() or via lower-level Mach mechanisms.) One of the fields in this data structure is a Mach message port that's allocated on cooperative thread creation; internal functions seem to refer to this port as the thread's "scheduling token."
An unexported function - !FindThreadByID - maps a thread ID (which is equivalent to a pthread identifier) to the associated data structure. The first time that it's called (and this can only happen if the first caller is a native thread), it checks the value of the unexported variable gMainThread. If gMainThread is null, a data structure is allocated for that native thread, gMainThread is set to that data structure, and !FindThreadByID returns that data structure. Thread data structures are maintained in a circular linked list, and in general !FindThreadByID walks this list until it finds a data structure with the right thread ID.
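The lookup described above can be modelled in a few lines of C. This is only a sketch under stated assumptions: the real structure layout is private and unknown, so thread_record, add_thread_record, find_thread_by_id and g_main_thread are hypothetical names, integers stand in for pthread identifiers and Mach port names, and the ID passed on the first call is assumed to be the caller's own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the library's private per-thread record. */
typedef struct thread_record {
    unsigned long thread_id;        /* stands in for the pthread identifier */
    unsigned int scheduling_token;  /* stands in for the Mach port name */
    struct thread_record *next;     /* circular: the last record points to the first */
} thread_record;

thread_record *g_main_thread = NULL;    /* models gMainThread */

/* Link a new record into the ring just after the main thread's record. */
thread_record *add_thread_record(unsigned long id) {
    thread_record *r = calloc(1, sizeof *r);
    if (r == NULL) return NULL;
    r->thread_id = id;
    if (g_main_thread == NULL) {
        r->next = r;                    /* a ring of one */
        g_main_thread = r;
    } else {
        r->next = g_main_thread->next;
        g_main_thread->next = r;
    }
    return r;
}

/* Models FindThreadByID: the first call adopts the caller as the main
   thread; later calls walk the ring once and return NULL for unknown IDs. */
thread_record *find_thread_by_id(unsigned long id) {
    if (g_main_thread == NULL)
        return add_thread_record(id);
    thread_record *r = g_main_thread;
    do {
        if (r->thread_id == id) return r;
        r = r->next;
    } while (r != g_main_thread);
    return NULL;
}
```

In this model a thread created behind the library's back (the pthread_create() case) simply has no record, which is why a lookup for it fails.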
YieldToThread and YieldToAnyThread will return a thread protocol error unless both the calling thread and the target have private thread-library data structures associated with them (i.e., each thread must be either a cooperative thread or gMainThread.) Given the current and target data structures (which contain the "scheduling token" Mach message ports), YieldToThread sends a message to the target thread's scheduling token port and waits for an incoming message on its own port. This is done atomically via the mach_msg() library function, which is a wrapper around the mach_msg_trap() syscall. The trap layer can detect and report interrupted system calls, but the mach_msg() library function automatically retries them. (This has implications for PROCESS-INTERRUPT: an interrupt that occurs when the target thread is executing foreign code sets a pending interrupt bit in the TCR; returning from a foreign function call checks this bit and arranges to execute the interrupt function at that point, if interrupts are enabled. A pending interrupt for a non-current cooperative thread won't execute until that thread receives a reply in the mach_msg() call invoked by YieldToThread.) The same problem - stemming from the fact that mach_msg() restarts interrupted system calls - limits the ability of PROCESS-INTERRUPT to reliably interrupt a thread that's waiting for low-level event messages, since mach_msg() is used in that context as well.
Obviously, the first successful call to YieldToThread when the target is a cooperative thread must be made from gMainThread; calling YieldToThread from gMainThread has the effect of making gMainThread behave as a (pseudo-)cooperative thread, i.e., it'll block in mach_msg(), waiting for an incoming message.
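The token handoff can be illustrated with a portable model. This is a sketch, not the library's implementation: a flag guarded by a pthread mutex and condition variable stands in for each Mach port, the send and receive are two separate steps rather than one atomic mach_msg() call, and every name here (sched_token, yield_to, run_handoff_demo and so on) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical model of a "scheduling token": a sticky flag guarded by a
   mutex and condition variable stands in for the per-thread Mach port. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             pending;    /* 1 when a "message" is waiting */
} sched_token;

void token_init(sched_token *t) {
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->cond, NULL);
    t->pending = 0;
}

void token_send(sched_token *t) {
    pthread_mutex_lock(&t->lock);
    t->pending = 1;
    pthread_cond_signal(&t->cond);
    pthread_mutex_unlock(&t->lock);
}

void token_receive(sched_token *t) {
    pthread_mutex_lock(&t->lock);
    while (!t->pending)
        pthread_cond_wait(&t->cond, &t->lock);
    t->pending = 0;
    pthread_mutex_unlock(&t->lock);
}

/* Models YieldToThread: wake the target, then block on our own token. */
void yield_to(sched_token *self, sched_token *target) {
    token_send(target);
    token_receive(self);
}

sched_token main_token, coop_token;
char trace[8];
int trace_len = 0;

void *coop_body(void *arg) {
    (void)arg;
    token_receive(&coop_token);          /* not running until scheduled */
    trace[trace_len++] = 'c';
    yield_to(&coop_token, &main_token);  /* hand control back */
    trace[trace_len++] = 'c';
    token_send(&main_token);             /* final handoff before exiting */
    return NULL;
}

/* Runs a short ping-pong; returns the number of trace entries, or -1. */
int run_handoff_demo(void) {
    pthread_t th;
    token_init(&main_token);
    token_init(&coop_token);
    trace_len = 0;
    if (pthread_create(&th, NULL, coop_body, NULL) != 0) return -1;
    trace[trace_len++] = 'm';
    yield_to(&main_token, &coop_token);
    trace[trace_len++] = 'm';
    yield_to(&main_token, &coop_token);
    pthread_join(th, NULL);
    trace[trace_len] = '\0';
    return trace_len;
}
```

Both threads are ordinary preemptively scheduled pthreads, yet only one of them is ever unblocked at a time; the thread playing gMainThread becomes "cooperative" simply by blocking on its own token.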
It seems to be the case that cooperative threads can share the message port used to receive event messages from the window server.
Unless the active cooperative thread is blocked in mach_msg() (due to event processing or due to the fact that it's about to become inactive), it should be possible to use PROCESS-INTERRUPT from a native thread to force it to cause some other cooperative thread to be scheduled. A native thread can effectively schedule the cooperative threads by doing something like:
(loop
  (sleep time-slice)   ; maybe something like .1 second
  (let* ((current-cooperative-thread (reliably-determine-current-cooperative-thread)))
    (when current-cooperative-thread
      (process-interrupt current-cooperative-thread
                         (lambda () (yield-to-any-thread))))))
Obviously, some details are tbd; it seems that the hard part here is RELIABLY-DETERMINE-CURRENT-COOPERATIVE-THREAD, but I think that a scheme involving simple locking would work (if we can't come up with anything better.)
YIELD-TO-ANY-THREAD (aka PROCESS-ALLOW-SCHEDULE, specialized to instances of a cooperative thread class) would have to maintain the global "current cooperative-thread" state, something like:
(setf (current-cooperative-thread) nil)
(#_YieldToAnyThread)
(setf (current-cooperative-thread) *current-process*)
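One way the "simple locking" scheme might look, modelled in C rather than lisp so that it can be compiled on its own. Everything here is an assumption made for illustration: coop_thread, current_coop, scheduler_tick and the stub are hypothetical names, setting interrupt_pending stands in for PROCESS-INTERRUPT, and the raw_yield callback stands in for YieldToAnyThread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct coop_thread {
    const char *name;
    int interrupt_pending;      /* stands in for a queued PROCESS-INTERRUPT */
} coop_thread;

pthread_mutex_t current_lock = PTHREAD_MUTEX_INITIALIZER;
coop_thread *current_coop = NULL;   /* NULL while a handoff is in progress */

void set_current_coop(coop_thread *t) {
    pthread_mutex_lock(&current_lock);
    current_coop = t;
    pthread_mutex_unlock(&current_lock);
}

/* One tick of the native scheduler thread. The read and the "interrupt"
   happen under the same lock, so the target cannot change in between.
   Returns the thread that was interrupted, or NULL if there was none. */
coop_thread *scheduler_tick(void) {
    pthread_mutex_lock(&current_lock);
    coop_thread *t = current_coop;
    if (t != NULL)
        t->interrupt_pending = 1;
    pthread_mutex_unlock(&current_lock);
    return t;
}

/* Wrapper a cooperative thread calls instead of yielding directly. */
void yield_to_any_thread(coop_thread *self, void (*raw_yield)(void)) {
    set_current_coop(NULL);     /* nobody is "current" during the handoff */
    raw_yield();                /* stands in for YieldToAnyThread() */
    set_current_coop(self);     /* we were rescheduled */
}

/* Stub yield used only to show what a tick sees mid-handoff. */
int saw_null_during_yield = 0;
void stub_raw_yield(void) {
    saw_null_during_yield = (scheduler_tick() == NULL);
}
```

The point of the design is that a scheduler tick which lands in the middle of a handoff finds no current thread and does nothing, instead of interrupting a thread that is already on its way to blocking.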
Cooperative thread creation
The primary way in which a thread gets created - the function xNewThread in the lisp kernel - waits for the newly-created preemptive thread to run (at least enough to create and initialize a TCR and signal a semaphore indicating that it's reached the point of having reset itself.) Some handshaking then takes place between the new thread and the creating thread (in the case of PROCESS-RUN-FUNCTION) or whatever thread presets and enables the new thread. The effect is that the new thread calls (via start_lisp()) its initial function (which is placed on its stack by PROCESS-PRESET) and may return from that initial function and wait to be preset again. It may be difficult to use that scheme in the case of cooperative threads, since it may be difficult for the creating thread to ensure that the newly-created thread is runnable during its initialization.
Another approach to thread creation involves existing support for "foreign threads" (threads that are created by foreign code but call back into lisp and therefore need a TCR, lisp stacks, exception support, etc.) The general idea (devil in the details) involves:
1. Splitting the current thread startup code into two parts, one of which handles TCR initialization (pre-reset) and the other of which handles calling the initial function, possibly repeatedly. At least some entry
2. Use NewThread to create a new cooperative thread, specifying the second part of the lisp startup code as the initial function. Create the thread in a "suspended" state (so that YieldToAnyThread won't try to cooperatively schedule it.)
3. Holding the exception lock, send the new thread a suspend (SIGUSR2) signal. Ensure that the suspend handler calls get_tcr(true) to create a TCR for the cooperative thread (it doesn't, at the moment, since a thread can currently only be suspended if it owns the TCR.) The thread will enter the suspend handler (with all signals masked) and will create a TCR, pushing some special binding info on the lisp stack.
4. Find the newly-created TCR and set its suspend count to 1; wait on the new TCR's "suspend" semaphore. (In general, we may have to busy-wait until we're sure that the new TCR has entered its suspend handler, unless we can find a way to initialize its suspend semaphore.) Once the thread looks like it's suspended - with a suspend count of 1 - we can release the exception lock. The thread will look like it was suspended in foreign code (in a call to mach_msg(), most likely.)
5. Discard the special-binding stuff and anything else that's been pushed on its stacks. Clear the "foreign thread" bit in its TCR; maybe set a "cooperative thread" bit. The lisp process object might want to have its class changed (to COOPERATIVE-PROCESS), so that we can specialize methods on it. Give the thread an initial function, e.g., preset it.
6. Set its thread-manager state to "active". Call resume_tcr() on its TCR; it should reenter and restart the interrupted mach_msg().
7. The cooperative scheduler should be able to schedule it.
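The ordering in steps 3 through 6 can be modelled with ordinary pthreads. This is a rough sketch of the handshake only, under loud assumptions: model_tcr and every function below are hypothetical, no signals or Carbon calls are involved, and a condition variable replaces both the suspend semaphore and the busy-wait mentioned in step 4.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for a TCR. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int in_suspend_handler;         /* set by the new thread (step 3) */
    int suspend_count;              /* set to 1 by the creator (step 4) */
    int (*initial_function)(void);  /* installed by "preset" (step 5) */
    int result;
} model_tcr;

void *new_thread_body(void *arg) {
    model_tcr *tcr = arg;
    pthread_mutex_lock(&tcr->lock);
    tcr->in_suspend_handler = 1;    /* "TCR now exists" */
    pthread_cond_broadcast(&tcr->changed);
    /* Stay parked until we have been preset and then resumed. */
    while (!(tcr->initial_function != NULL && tcr->suspend_count == 0))
        pthread_cond_wait(&tcr->changed, &tcr->lock);
    int (*fn)(void) = tcr->initial_function;
    pthread_mutex_unlock(&tcr->lock);
    tcr->result = fn();             /* run the initial function */
    return NULL;
}

/* Creator side: returns the initial function's result, or -1 on failure. */
int create_suspended_then_resume(int (*fn)(void)) {
    model_tcr tcr;
    pthread_t th;
    pthread_mutex_init(&tcr.lock, NULL);
    pthread_cond_init(&tcr.changed, NULL);
    tcr.in_suspend_handler = 0;
    tcr.suspend_count = 0;
    tcr.initial_function = NULL;
    tcr.result = -1;
    if (pthread_create(&th, NULL, new_thread_body, &tcr) != 0) return -1;

    pthread_mutex_lock(&tcr.lock);
    while (!tcr.in_suspend_handler)         /* wait for handler entry */
        pthread_cond_wait(&tcr.changed, &tcr.lock);
    tcr.suspend_count = 1;                  /* step 4: looks suspended */
    tcr.initial_function = fn;              /* step 5: preset */
    tcr.suspend_count = 0;                  /* step 6: resume */
    pthread_cond_broadcast(&tcr.changed);
    pthread_mutex_unlock(&tcr.lock);

    pthread_join(th, NULL);
    int result = tcr.result;
    pthread_cond_destroy(&tcr.changed);
    pthread_mutex_destroy(&tcr.lock);
    return result;
}

int sample_initial_function(void) { return 42; }
```

For example, create_suspended_then_resume(sample_initial_function) returns 42 once the new thread has run. The property the model shares with the steps above is that the new thread never touches its initial function until the creator has finished presetting it and has dropped the suspend count back to zero.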
As soon as possible after Carbon libraries are initialized, a native thread should be started to run the cooperative scheduler loop.
Soon after that, a thread should be created that will become "gMainThread", e.g., a native thread that gets cooperatively scheduled. If it's the current cooperative thread while it's running, the scheduler thread will cause it to yield to itself periodically. It doesn't need to do anything while Carbon code is loading, but can serve as the primary event-processing thread.
It's important that some thread that we're willing to allow to "turn cooperative" (become gMainThread) is created before some other operation (connecting to the window/font server) happens in the listener or the (native) initial thread during Carbon loading and application startup.