Changes between Initial Version and Version 1 of Ticket #234

02/01/08 06:49:44 (9 years ago)

I added some formatting to your original message; I think that it makes the code easier to read.

If you're asking whether garbage collection of an image containing > 50MB (or ~ half a gigabyte) of list structure can be as fast - all other things being equal - as GC would be if that list wasn't live ... well, zero-cost GC would be a good thing. How would it work ?

The time that a GC takes is generally proportional to the amount of live (non-garbage) data in the heap. It's also affected by the structure of that data: the GC has to prove that each and every one of those 3,200,000 CONS cells is reachable (and, in general, that their CARs are, though the CARs in your example are just fixnums.) It is hard to know how to tell what's live data and what isn't without looking at every live object; you can maybe reduce the incremental cost of looking, but it's likely to hold true that this is still O(n).

If your mergesort algorithm works by making many copies of some or all of the source list (and these copies are all "live", reachable, non-garbage, relatively long-lived objects), then n increases and GC time increases along with it.

If the mergesort algorithm works by having each thread repeatedly cons relatively short-lived objects, then enabling the EGC (or rather, not disabling it) may improve locality as well as reduce GC times. I don't know how that algorithm works.

If you're saying that you want to cons a lot, have the GC collect what it can, and not have this cost anything ... good luck. I'll try to look at this, but in general I wouldn't be surprised if GC costs dominate execution times of things that allocate and retain lots of objects; if you can implement a parallel mergesort that doesn't allocate and retain lots of objects, its execution time shouldn't be dominated by GC time. If most of the consing you're doing in your algorithm is incidental, then it isn't worthwhile to do full GC (and once again prove that every cons cell in the original list is still live), and turning off the EGC wouldn't make much sense.

(None of this has anything to do with the cost of processing large numbers of weak object references, which is currently sometimes non-linear.)


  • Ticket #234

    • Property Status changed from new to assigned
  • Ticket #234 – Description

    initial v1  
    2525(include-book "finite-set-theory/osets/sets" :dir :system) 
    124124(ACL2::assign$ serial-result (time$ (SETS::mergesort-exec (@ int-list))))