(guile.info.gz) Conservative GC
Info Catalog
(guile.info.gz) General Rules
(guile.info.gz) How Guile does it
(guile.info.gz) Immediates vs Non-immediates
18.2.2 Conservative Garbage Collection
--------------------------------------
Aside from the latent typing, the major source of constraints on a
Scheme implementation's data representation is the garbage collector.
The collector must be able to traverse every live object in the heap, to
determine which objects are not live.
There are many ways to implement this, but Guile uses an algorithm
called "mark and sweep". The collector scans the system's global
variables and the local variables on the stack to determine which
objects are immediately accessible by the C code. It then scans those
objects to find the objects they point to, et cetera. The collector
sets a "mark bit" on each object it finds, so each object is traversed
only once. This process is called "tracing".
When the collector can find no unmarked objects pointed to by marked
objects, it assumes that any objects that are still unmarked will never
be used by the program (since there is no path of dereferences from any
global or local variable that reaches them) and deallocates them.
In the above paragraphs, we did not specify how the garbage collector
finds the global and local variables; as usual, there are many different
approaches. Frequently, the programmer must maintain a list of pointers
to all global variables that refer to the heap, and another list
(adjusted upon entry to and exit from each function) of local variables,
for the collector's benefit.
The list of global variables is usually not too difficult to
maintain, since global variables are relatively rare. However, an
explicitly maintained list of local variables (in the author's personal
experience) is a nightmare to maintain. Thus, Guile uses a technique
called "conservative garbage collection", to make the local variable
list unnecessary.
The trick to conservative collection is to treat the stack as an
ordinary range of memory, and assume that _every_ word on the stack is
a pointer into the heap. Thus, the collector marks all objects whose
addresses appear anywhere in the stack, without knowing for sure how
that word is meant to be interpreted.
Obviously, such a system will occasionally retain objects that are
actually garbage, and should be freed. In practice, this is not a
problem. The alternative, an explicitly maintained list of local
variable addresses, is effectively much less reliable, due to programmer
error.
To accommodate this technique, data must be represented so that the
collector can accurately determine whether a given stack word is a
pointer or not. Guile does this as follows:
* Every heap object has a two-word header, called a "cell". Some
objects, like pairs, fit entirely in a cell's two words; others may
store pointers to additional memory in either of the words. For
example, strings and vectors store their length in the first word,
and a pointer to their elements in the second.
* Guile allocates whole arrays of cells at a time, called "heap
segments". These segments are always allocated so that the cells
they contain fall on eight-byte boundaries, or whatever is
appropriate for the machine's word size. Guile keeps all cells in
a heap segment initialized, whether or not they are currently in
use.
* Guile maintains a sorted table of heap segments.
Thus, given any random word W fetched from the stack, Guile's
garbage collector can consult the table to see if W falls within a
known heap segment, and check W's alignment. If both tests pass, the
collector knows that W is a valid pointer to a cell, intentional or
not, and proceeds to trace the cell.
Note that heap segments do not contain all the data Guile uses; cells
for objects like vectors and strings contain pointers to other memory
areas. However, since those pointers are internal, and not shared among
many pieces of code, it is enough for the collector to find the cell,
and then use the cell's type to find more pointers to trace.
Info Catalog
(guile.info.gz) General Rules
(guile.info.gz) How Guile does it
(guile.info.gz) Immediates vs Non-immediates
automatically generated byinfo2html