Changeset 15534


Ignore:
Timestamp:
Dec 11, 2012, 6:10:59 PM (7 years ago)
Author:
rme
Message:

Translate the rest of the GC chapter. It could now use an update,
both for content and style.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • trunk/source/doc/tex/gc.tex

    r15520 r15534  
    11\chapter{The Garbage Collector}
     2
     3\section{Heap Space Allocation}
     4
     5\CCL\ manages a contiguous region of memory.  It maps a large block of
     6address space at startup, and then uses (and returns) memory from that
     7area as the size of the live lisp heap data grows and shrinks.  After
     8the initial heap image loads and after each full GC, the lisp kernel
     9will try to ensure that a specified amount (the
     10``lisp-heap-gc-threshold'') of free memory is available. The initial
     11value of this kernel variable is 16MB on 32-bit implementations and
     1232MB on 64-bit implementations; it can be manipulated from Lisp (see
     13below.)
     14
     15The large reserved memory block consumes very little in the way of
     16system resources; memory that's actually committed to the lisp heap
     17(live data and the ``threshold'' area where allocation takes place)
     18consumes finite resources (physical memory and swap space). The lisp's
     19consumption of those resources is proportional to its actual memory
     20usage, which is generally a good thing.
     21
     22The \cd{-R} or \cd{-{}-heap-reserve} command-line option can be used to
     23limit the size of the reserved block and therefore bound heap
     24expansion.
    225
    326\section{Ephemeral GC}
     
    123146\end{defun}
    124147
     148\section{Page Reclamation Policy}
     149
     150After a full GC finishes, it'll try to ensure that at least
     151\cd{(lisp-heap-gc-threshold)} of virtual memory are available; objects will
     152be allocated in this block of memory until it fills up, the GC is
     153triggered, and the process repeats itself.
     154
     155Many programs reach near stasis in terms of the amount of logical
     156memory that's in use after full GC (or run for long periods of time in
     157a nearly static state), so the logical address range used for consing
     158after the $n$th full GC is likely to be nearly or entirely identical to
     159the address range used by the $n+1$th full GC.
     160
     161By default (and traditionally in CCL), the GC's policy is to ``release''
     162the pages in this address range: to advise the virtual memory system
     163that the pages contain garbage and any physical pages associated with
     164them don't need to be swapped out to disk before being reused and to
     165(re-)map the logical address range so that the pages will be
     166zero-filled by the virtual memory system when they're next
     167accessed. This policy is intended to reduce the load on the VM system
     168and keep CCL's working set to a minimum.
     169
     170For some programs (especially those that cons at a very high rate),
     171the default policy may be less than ideal: releasing pages that are
     172going to be needed almost immediately---and zero-fill-faulting them
     173back in, lazily---incurs unnecessary overhead. (There's a false
     174economy associated with minimizing the size of the working set if it's
     175just going to shoot back up again until the next GC.) A policy of
     176``retaining'' pages between GCs might work better in such an
     177environment.
     178
     179Functions described below give the user some control over this
     180behavior. An adaptive, feedback-mediated approach might yield a better
     181solution.
     182
     183\begin{defun}[Function]
     184lisp-heap-gc-threshold
     185
     186Returns the value of the kernel variable that specifies the amount of
     187free space to leave in the heap after full gc.
     188\end{defun}
     189
     190\begin{defun}[Function]
     191set-lisp-heap-gc-threshold new-threshold
     192
     193Sets the value of the kernel variable that specifies the amount of
     194free space to leave in the heap after full GC to {\it new-threshold},
     195which should be a non-negative fixnum. Returns the value of that
     196kernel variable (which may be somewhat larger than what was
     197specified).
     198\end{defun}
     199
     200\begin{defun}[Function]
     201use-lisp-heap-gc-threshold
     202
     203Tries to grow or shrink lisp's heap space, so that the free space is
     204(approximately) equal to the current heap threshold. Returns \nil.
     205\end{defun}
     206
     207\begin{defun}[Function]
     208gc-retain-pages flag
     209
     210Tries to influence the GC to retain/recycle the pages allocated
     211between GCs if {\it flag} is true, and to release them otherwise. This is
     212generally a tradeoff between paging and other VM considerations.
     213\end{defun}
     214
     215\begin{defun}[Function]
     216gc-retaining-pages
     217
     218Returns \cd{t} if the GC tries to retain pages between full GCs and \nil\ if
     219it's trying to release them to improve VM paging performance.
     220\end{defun}
     221
     222
     223\section{Pure Areas}
     224
     225The function \fn{save-application} identifies code vectors and the
     226pnames of interned symbols and copies these objects to a "pure" area
     227of the image file it creates. (The "pure" area accounts for most of
     228what the \cd{room} function reports as "static" space.)
     229
     230When the resulting image file is loaded, the pure area of the file is
     231memory-mapped with read-only access. Code and pure data are paged in
     232from the image file as needed (and don't compete for global virtual
     233memory resources with other memory areas.)
     234
     235Code-vectors and interned symbol pnames are immutable: it is an error
     236to try to change the contents of such an object. Previously, that
     237error would have manifested itself in some random way. In the new
     238scheme, it'll manifest itself as an "unhandled exception" error in the
     239Lisp kernel. The kernel could probably be made to detect a spurious,
     240accidental write to read-only space and signal a lisp error in that
     241case, but it doesn't yet do so.
     242
     243The image file should be opened and/or mapped in some mode which
     244disallows writing to the memory-mapped regions of the file from other
     245processes. I'm not sure of how to do that; writing to the file when
     246it's mapped by CCL can have unpredictable and unpleasant
     247results. \cd{save-application} will delete its output file's directory
     248entry and create a new file; one may need to exercise care when using
     249file system utilities (like tar, for instance) that might overwrite an
     250existing image file.
     251
     252
     253\section{Weak References}
     254
     255In general, a ``weak reference'' is a reference to an object which
     256does not prevent the object from being garbage-collected. For example,
     257suppose that you want to keep a list of all the objects of a certain
     258type. If you don't take special steps, the fact that you have a list
     259of them will mean that the objects are always live, because you can
     260always reference them through the list. Therefore, they will never be
     261garbage-collected, and their memory will never be reclaimed, even if
     262they are referenced nowhere else in the program. If you don't want
     263this behavior, you need weak references.
     264
     265CCL supports weak references with two kinds of objects: weak hash
     266tables and populations.
     267
     268Weak hash tables are created with the standard Common Lisp function
     269\cd{make-hash-table}, which is extended to accept the keyword argument
     270\cd{:weak}. Hash tables may be weak with respect to either their keys
     271or their values. To make a hash table with weak keys, invoke
     272make-hash-table with the option \cd{:weak t}, or, equivalently,
     273\cd{:weak :key}. To make one with weak values, use \cd{:weak :value}.
     274When the key is weak, the equality test must be \cd{eq}
     275(because it wouldn't make sense otherwise).
     276
     277When garbage-collection occurs, key-value pairs are removed from the
     278hash table if there are no non-weak references to the weak element of
     279the pair (key or value).
     280
     281In general, weak-key hash tables are useful when you want to use the
     282hash to store some extra information about the objects you look up in
     283it, while weak-value hash tables are useful when you want to use the
     284hash as an index for looking up objects.
     285
     286A population encapsulates an object, causing certain reference from
     287the object to be considered weak. CCL supports two kinds of
     288populations: lists, in which case the encapsulated object is a list of
     289elements, which are spliced out of the list when there are no non-weak
     290references to the element; and alists, in which case the encapsulated
     291object is a list of conses which are spliced out of the list if there
     292are no non-weak references to the car of the cons.
     293
     294If you are experimenting with weak references interactively, remember
     295that the values of the last three expressions are referenced by the
     296variables \cd{*}, \cd{**}, and \cd{***}. The easy workaround is to
     297evaluate a few meaningless expressions in order to get the object out
     298of the REPL variables before invoking the GC.
     299
     300\begin{defun}[Function]
     301population &key :type :initial-contents
     302
     303The value of the \cd{:type} keyword argument must be either \cd{:list}
     304(the default), or \cd{:alist}.  The \cd{:initial-contents} argument
     305should be a sequence of elements (or conses, for type \cd{:alist})
     306which will be used to initialize the population.  The sequence itself
     307(and the conses in case of an alist) is not stored in the population;
     308a new list or alist is created to hold the elements.
     309
     310The new population is returned.
     311\end{defun}
     312
     313\begin{defun}[Function]
     314population-type population
     315
     316This function returns the type of {\it population}, either
     317\cd{:list} or \cd{:alist}.
     318\end{defun}
     319
     320\begin{defun}[Function]
     321population-contents population
     322
     323This function returns the list encapsulated in {\it population}. Note that
     324as long as there is a direct (non-weak) reference to this list, it
     325will not be modified by the garbage collector. Therefore it is safe to
     326traverse the list, and even modify it, no different from any other
     327list. If you want the elements to become garbage-collectable again,
     328you must stop refering to the list directly.
     329
     330This function may be used with \cd{setf} to update the contents of
     331{\it population}.  The new list is copied;  it is not used directly.
     332\end{defun}
     333
     334
     335
     336
Note: See TracChangeset for help on using the changeset viewer.