Ticket #1005 (closed defect: fixed)

Opened 2 years ago

Last modified 21 months ago

delay starting threads

Reported by: rme Owned by: gb
Priority: normal Milestone:
Component: Runtime (threads, GC) Version: trunk
Keywords: Cc:

Description

When using a trunk darwinx8664 lisp at r15433 on Mountain Lion, slime often takes tens of seconds to start up.

At a quick first glance, I observe that allocate_tcr() often ends up looping hundreds of thousands, or even millions, of times before it gets a TCR that has a suitable address to use as a Mach port name.
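For readers who haven't looked at allocate_tcr(), the retry pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual lisp-kernel code: sketch_tcr, address_fits_in_32_bits() and allocate_low_tcr() are hypothetical names, and the max_tries bound is an addition so that the sketch terminates on systems where malloc() never returns a low address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real TCR; only the link field matters here. */
typedef struct sketch_tcr {
  struct sketch_tcr *next;
  char payload[256];
} sketch_tcr;

/* True if the address survives truncation to 32 bits, i.e. it could be
   used where a 32-bit name (such as a Mach port name) is required. */
static int address_fits_in_32_bits(uint64_t addr) {
  return addr == (uint64_t)(uint32_t)addr;
}

/* Keep allocating until a block lands below 4GB or max_tries is reached.
   Rejected blocks are chained and only freed at the end, so the allocator
   cannot hand the same unsuitable address straight back.
   Returns NULL if no suitable block was found; *tries gets the count. */
static sketch_tcr *allocate_low_tcr(unsigned long max_tries,
                                    unsigned long *tries) {
  sketch_tcr *chain = NULL, *found = NULL, *next;
  unsigned long n = 0;

  assert(tries != NULL);
  while (n < max_tries) {
    sketch_tcr *t = calloc(1, sizeof *t);
    if (t == NULL) break;               /* out of memory: give up */
    n++;
    if (address_fits_in_32_bits((uint64_t)(uintptr_t)t)) {
      found = t;
      break;
    }
    t->next = chain;                    /* remember the reject */
    chain = t;
  }
  for (; chain != NULL; chain = next) { /* release all rejects */
    next = chain->next;
    free(chain);
  }
  *tries = n;
  return found;
}
```

Every iteration depends on where the allocator happens to place the block, which is consistent with the wildly varying startup times reported here.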

Slime creates several threads at startup, and usually a few of them end up taking a while to get going. Crudely instrumenting allocate_tcr() shows that finding a suitable port name can take anywhere from under a second to 20 seconds or more.

Change History

comment:1 Changed 2 years ago by gb

  • Owner changed from rme to gb
  • Status changed from new to assigned

comment:2 Changed 2 years ago by gb

(In [15437]) Stop trying to get by on our good looks and charm (or at least stop assuming that x8664 Darwin's malloc() will quickly return a 32-bit pointer in allocate_tcr()). See ticket:1005. Instead, try to map a largish number (1K) of TCRs in free 32-bit memory and manage them explicitly on x8664 Darwin.

Note that this isn't thread-safe in general: we walk our address space (via vm_region_64()) until we find a free block of 32-bit-addressable memory, then claim it with mmap() (with the MAP_FIXED option). When this happens at any time after application startup, some foreign thread may be mapping or unmapping regions while we're doing this. (This is why OSes that provide mmap options that request 32-bit addresses do so in the kernel.) In practice it's likely fairly hard to exceed the initial allocation of 1K TCRs, and it's not clear that this is any worse than the "wait until we get lucky with malloc()" strategy has been. Still, it may be better to do the TCR allocation just once on startup, avoid the (theoretical) thread-safety issues, and treat the (possibly raised) value of TCR_CLUSTER_COUNT as a hard limit.

  • lisp-kernel/platform-darwinx8664: define DARWIN64, to make conditionalization a little easier
  • lisp-kernel/memory.c: implement darwin_allocate_tcr() and darwin_free_tcr() as outlined above
  • lisp-kernel/thread_manager.c: allocate_tcr() uses darwin_allocate_tcr() on DARWIN64. Use darwin_free_tcr() instead of free() on DARWIN64. Make shutdown_thread_tcr() dequeue the TCR and put it back in the free TCR pool on DARWIN64.
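The explicit pool management mentioned in the file notes can be pictured as a simple free list over a fixed cluster. The sketch below is an assumption about the general shape, not the code in memory.c: pool_tcr, pool_allocate_tcr(), pool_free_tcr() and SKETCH_CLUSTER_COUNT are hypothetical names, a static array stands in for the low-memory mapping, and there is no locking, which a real multi-threaded version would need.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CLUSTER_COUNT 4  /* hypothetical; the commit message uses 1K */

typedef struct pool_tcr {
  struct pool_tcr *next;        /* link in the free list */
  int in_use;
} pool_tcr;

/* Stands in for the block of TCRs mapped in 32-bit memory. */
static pool_tcr cluster[SKETCH_CLUSTER_COUNT];
static pool_tcr *free_list = NULL;
static int pool_ready = 0;

/* Thread every slot of the cluster onto the free list, lowest slot first. */
static void pool_init(void) {
  int i;
  free_list = NULL;
  for (i = SKETCH_CLUSTER_COUNT - 1; i >= 0; i--) {
    cluster[i].in_use = 0;
    cluster[i].next = free_list;
    free_list = &cluster[i];
  }
  pool_ready = 1;
}

/* Pop a TCR from the free list. Returns NULL once the cluster is
   exhausted, i.e. the cluster size acts as a hard limit. */
static pool_tcr *pool_allocate_tcr(void) {
  pool_tcr *t;
  if (!pool_ready) pool_init();
  t = free_list;
  if (t == NULL) return NULL;
  free_list = t->next;
  t->next = NULL;
  t->in_use = 1;
  return t;
}

/* Push a TCR back onto the free list so a later thread can reuse it. */
static void pool_free_tcr(pool_tcr *t) {
  assert(t != NULL && t->in_use);
  t->in_use = 0;
  t->next = free_list;
  free_list = t;
}
```

Returning NULL on exhaustion corresponds to treating the cluster size as a hard limit, the option raised at the end of the commit message.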

comment:3 Changed 2 years ago by gb

r15437 tries to address the problem that rme identified (though I'm a little nervous about how attempts to create implausibly large numbers of threads are handled).

When I tried to reproduce the problem, my .emacs file (copied and pasted from somewhere) set the inferior lisp program to (32-bit) "ccl". M-x slime hung once (in about a dozen tries); I don't know why it did so, but top didn't seem to show much CPU activity in the dx86cl process.

(In other words, there may be more than one bug here, though it wasn't even clear that what I saw was a problem in ccl and not Emacs or the OS.)

comment:4 Changed 2 years ago by gb

  • Status changed from assigned to new

r15439 avoids the thread-safety issues with r15437.

comment:5 Changed 21 months ago by gb

  • Status changed from new to closed
  • Resolution set to fixed

This was effectively fixed by the move away from Mach exception handling.
