DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 

(guile.info.gz) Conservative GC

Info Catalog (guile.info.gz) General Rules (guile.info.gz) How Guile does it (guile.info.gz) Immediates vs Non-immediates
 
 18.2.2 Conservative Garbage Collection
 --------------------------------------
 
 Aside from the latent typing, the major source of constraints on a
 Scheme implementation's data representation is the garbage collector.
 The collector must be able to traverse every live object in the heap, to
 determine which objects are not live.
 
    There are many ways to implement this, but Guile uses an algorithm
 called "mark and sweep".  The collector scans the system's global
 variables and the local variables on the stack to determine which
 objects are immediately accessible by the C code.  It then scans those
 objects to find the objects they point to, et cetera.  The collector
 sets a "mark bit" on each object it finds, so each object is traversed
 only once.  This process is called "tracing".
 
    When the collector can find no unmarked objects pointed to by marked
 objects, it assumes that any objects that are still unmarked will never
 be used by the program (since there is no path of dereferences from any
 global or local variable that reaches them) and deallocates them.
 
    In the above paragraphs, we did not specify how the garbage collector
 finds the global and local variables; as usual, there are many different
 approaches.  Frequently, the programmer must maintain a list of pointers
 to all global variables that refer to the heap, and another list
 (adjusted upon entry to and exit from each function) of local variables,
 for the collector's benefit.
 
    The list of global variables is usually not too difficult to
 maintain, since global variables are relatively rare.  However, an
 explicitly maintained list of local variables (in the author's personal
 experience) is a nightmare to maintain.  Thus, Guile uses a technique
 called "conservative garbage collection", to make the local variable
 list unnecessary.
 
    The trick to conservative collection is to treat the stack as an
 ordinary range of memory, and assume that _every_ word on the stack is
 a pointer into the heap.  Thus, the collector marks all objects whose
 addresses appear anywhere in the stack, without knowing for sure how
 that word is meant to be interpreted.
 
    Obviously, such a system will occasionally retain objects that are
 actually garbage, and should be freed.  In practice, this is not a
 problem.  The alternative, an explicitly maintained list of local
 variable addresses, is effectively much less reliable, due to programmer
 error.
 
    To accommodate this technique, data must be represented so that the
 collector can accurately determine whether a given stack word is a
 pointer or not.  Guile does this as follows:
 
    * Every heap object has a two-word header, called a "cell".  Some
      objects, like pairs, fit entirely in a cell's two words; others may
      store pointers to additional memory in either of the words.  For
      example, strings and vectors store their length in the first word,
      and a pointer to their elements in the second.
 
    * Guile allocates whole arrays of cells at a time, called "heap
      segments".  These segments are always allocated so that the cells
      they contain fall on eight-byte boundaries, or whatever is
      appropriate for the machine's word size.  Guile keeps all cells in
      a heap segment initialized, whether or not they are currently in
      use.
 
    * Guile maintains a sorted table of heap segments.
 
    Thus, given any random word W fetched from the stack, Guile's
 garbage collector can consult the table to see if W falls within a
 known heap segment, and check W's alignment.  If both tests pass, the
 collector knows that W is a valid pointer to a cell, intentional or
 not, and proceeds to trace the cell.
 
    Note that heap segments do not contain all the data Guile uses; cells
 for objects like vectors and strings contain pointers to other memory
 areas.  However, since those pointers are internal, and not shared among
 many pieces of code, it is enough for the collector to find the cell,
 and then use the cell's type to find more pointers to trace.
 
Info Catalog (guile.info.gz) General Rules (guile.info.gz) How Guile does it (guile.info.gz) Immediates vs Non-immediates
automatically generated byinfo2html