Oct 18 1999.

VisualWorks reclamation facilities


Taken from:

ObjectMemory(class)>>reclamationFacilities

The virtual machine (VM) has several facilities for reclaiming the space occupied by objects that are no longer accessible from the system roots: a generation scavenger, an incremental garbage collector, a compacting garbage collector, a global compacting garbage collector, and a data compactor. Except for the scavenger, the VM does not invoke these facilities directly, leaving the policy decision of when to run them up to the CurrentMemoryPolicy. Even the operation of the scavenger can be controlled by manipulating the various thresholds associated with it.

Generation Scavenger

The primary reclamation system is a generation scavenger. The scavenger reclaims those objects that expire while residing in NewSpace. It has been shown in several studies that scavenging is a particularly efficient way of collecting garbage in an object population whose members are more likely to die than survive. This is because a scavenger spends most of its time copying survivors, who will be few in number in such populations, rather than tracing corpses, who will be many in number in such populations. This fact makes scavenging an especially appropriate technique for use in NewSpace, since studies have shown that most newly created objects (i.e., greater than 95%) fail to survive for any significant length of time.
A scavenger is called a generation scavenger if it segregates objects into multiple generations and then concentrates on the younger generations where the deaths are more likely to occur and hence where the reclamation efforts are more likely to reap the greatest benefit for least effort. For the purposes of our scavenger, all objects fall into one of two generations--those objects that are housed in NewSpace and those objects that are housed elsewhere (i.e., in OldSpace or PermSpace). Those objects housed elsewhere die relatively infrequently, so scavenging is not necessarily the ideal reclamation system for such generations. As mentioned above, however, almost all of the objects in NewSpace die, so scavenging is quite appropriate for this generation.
Briefly, the current scavenger works as follows. Whenever Eden (object-creation space) is about to fill up, the scavenger is invoked. It locates all those objects in Eden and the occupied SurvivorSpace that are reachable from the system roots and copies them to the unoccupied SurvivorSpace. Once this copying is done, Eden and the formerly occupied SurvivorSpace contain only corpses; hence, they are effectively empty and can be re-used. The scavenger uses the objects in the RT and the objects referenced from the Smalltalk stacks as roots. In addition, the scavenger is designed to operate without disrupting the user. It attempts to remain non-disruptive by limiting its workload as follows: if the aggregate size of the objects that survive each scavenge grows large enough to slow down the scavenger's operation, it starts to copy some of the oldest surviving objects to OldSpace. We call this 'tenuring' an object.
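
The copy-then-tenure behavior described above can be sketched with a toy model. This is only an illustration in Python, not the VM's implementation: the Obj class, the scavenge function, and the tenure_limit_bytes parameter are hypothetical names, and the roots list stands in for the RT entries and stack references (the RT is assumed here to hold old objects that refer to new ones).

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    """Hypothetical heap object used only for this sketch."""
    name: str
    size: int
    refs: list = field(default_factory=list)
    age: int = 0  # number of scavenges survived so far


def scavenge(roots, eden, occupied, old_space, tenure_limit_bytes):
    """Return the new occupied SurvivorSpace as a list of live young objects.

    Only young objects (those in eden or the occupied survivor space) are
    copied. Old objects are scanned only if they are roots. If the survivors
    exceed tenure_limit_bytes, the oldest ones are tenured to old_space.
    """
    young = {id(o) for o in eden} | {id(o) for o in occupied}
    root_ids = {id(r) for r in roots}
    survivors, seen, stack = [], set(), list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if id(obj) in young:
            obj.age += 1
            survivors.append(obj)      # "copy" to the unoccupied space
            stack.extend(obj.refs)
        elif id(obj) in root_ids:
            stack.extend(obj.refs)     # scan a root, but do not copy it
    # Tenure the oldest survivors until the size limit is respected.
    survivors.sort(key=lambda o: o.age)
    while survivors and sum(o.size for o in survivors) > tenure_limit_bytes:
        old_space.append(survivors.pop())
    # Everything left behind is a corpse, so both spaces are reusable.
    eden.clear()
    occupied.clear()
    return survivors
```

Note that the work done is proportional to the number of survivors: unreachable objects are never visited, which is why the technique suits a space where most objects die.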

You can control the operation of the scavenger by manipulating the following thresholds (this class contains messages for reading and setting each of these thresholds):

edenUsedByteScavengeThreshold
When the aggregate size of the objects in Eden exceeds this number, the scavenger is invoked by the VM.

survUsedBytesTenuringThreshold
When the aggregate size of the objects in SurvivorSpace exceeds this number, then the scavenger will begin to tenure objects to OldSpace. It will tenure enough objects so that this threshold is no longer exceeded.

largeFreeBytesTenuringThreshold
Ordinarily, the headers of large byte-type objects (e.g., strings, uninterpreted bytes, and byte arrays) are housed in NewSpace and their data in LargeSpace. Thus, the scavenger only has to move the headers of such objects during a scavenge. Since the headers of such objects don't ordinarily consume much space, the system generally refuses to tenure these headers, thus allowing the scavenger to reclaim such objects when they die. However, the only way to free up space in LargeSpace is by moving the bodies of large objects to OldSpace, and to do this we need to first tenure their headers to OldSpace. Accordingly, when the number of free bytes in LargeSpace drops below the largeFreeBytesTenuringThreshold, the scavenger tenures enough large objects that moving their bodies out would bring the free space back above this threshold. This permits the large-object allocator to then move the bodies of such objects out of LargeSpace and into OldSpace should it need to free up additional space.
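
The direction of each comparison is easy to get wrong (two thresholds are exceeded, one is dropped below), so the three triggers can be summarized in a small sketch. It is written in Python as a toy model: the Thresholds class, its field names, its default values, and the scavenger_actions function are all hypothetical, and the sketch assumes that tenuring decisions are made only while a scavenge is running.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Illustrative values only; the real defaults are not given in the text.
    eden_used_bytes_scavenge: int = 200_000
    surv_used_bytes_tenuring: int = 40_000
    large_free_bytes_tenuring: int = 100_000


def scavenger_actions(eden_used, surv_used, large_free, t):
    """List what the scavenger would do for the given space usage."""
    actions = []
    if eden_used > t.eden_used_bytes_scavenge:
        actions.append("scavenge")
        # Assumption: tenuring is decided only during a scavenge.
        if surv_used > t.surv_used_bytes_tenuring:
            actions.append("tenure survivors")
        if large_free < t.large_free_bytes_tenuring:
            actions.append("tenure large-object headers")
    return actions
```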

Incremental Garbage Collector

The system also has an incremental garbage collector (IGC). Unlike the scavenger, which only reclaims objects in NewSpace, the IGC only reclaims objects in OldSpace. It does so incrementally, recycling dead objects by placing their headers and their bodies on the appropriate threaded free list. The IGC can be made to stop if any kind of interrupt occurs, or it can be made to ignore all interrupts. In addition, you can specify the amount of work that you want the IGC to perform, in terms of both the number of objects scanned and the number of bytes scanned, and it will stop as soon as either limit is reached. The VM never invokes the IGC itself. Only Smalltalk code can run the IGC. A typical memory policy might be to run the IGC in the idle loop, in low-space conditions, and periodically in order to keep up with the OldSpace death rate.

The IGC has five distinct phases of operation:

1. Resting -- the IGC is idle.
2. Marking -- the IGC is marking live objects.
3. Nilling -- the IGC is nilling the slots of WeakArrays whose referents have expired.
4. Sweeping -- the IGC is sweeping the OT, placing dead objects on the threaded free lists.
5. Unmarking -- the IGC is unmarking objects as a result of the mark phase being aborted, either at the user's request or because the IGC ran out of memory to hold its mark stack.

The typical order of operation is: resting -> marking -> nilling -> sweeping -> resting. The unmarking phase is only entered if the mark phase is aborted, and it leaves the IGC in the resting phase when the unmarking is completed. Each of the above phases is performed incrementally; that is, each can be interrupted without losing any of the work performed prior to the interrupt. The IGC never performs more than one phase per invocation. This provision permits clients to specify different workloads and different interrupt policies for the different phases. Consequently, clients will need to wrap their calls to the IGC in a loop if they want it to complete all of the phases. There is protocol for doing this on the class-side of ObjectMemory.
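
The phase ordering and the need for a client-side loop can be sketched as a small state machine. This is a toy model in Python, not the ObjectMemory protocol: ToyIGC, its methods, the quota argument, and run_full_cycle are hypothetical names, and each phase is assumed to consist of a fixed number of abstract work units.

```python
from enum import Enum


class Phase(Enum):
    RESTING = "resting"
    MARKING = "marking"
    NILLING = "nilling"
    SWEEPING = "sweeping"
    UNMARKING = "unmarking"


NEXT = {
    Phase.MARKING: Phase.NILLING,
    Phase.NILLING: Phase.SWEEPING,
    Phase.SWEEPING: Phase.RESTING,
    Phase.UNMARKING: Phase.RESTING,
}


class ToyIGC:
    def __init__(self, work_per_phase):
        self.phase = Phase.RESTING
        self.work_per_phase = work_per_phase
        self.remaining = 0

    def start(self):
        if self.phase is Phase.RESTING:
            self.phase, self.remaining = Phase.MARKING, self.work_per_phase

    def abort_mark(self):
        # Marks made so far must be undone before resting again.
        if self.phase is Phase.MARKING:
            done = self.work_per_phase - self.remaining
            self.phase, self.remaining = Phase.UNMARKING, done

    def step(self, quota):
        """Do at most quota units of the current phase, and no other phase."""
        if self.phase is Phase.RESTING:
            return
        self.remaining -= min(quota, self.remaining)
        if self.remaining == 0:
            self.phase = NEXT[self.phase]
            if self.phase is not Phase.RESTING:
                self.remaining = self.work_per_phase


def run_full_cycle(igc, quota):
    """Loop until the collector rests; return the phases it went through."""
    igc.start()
    visited = []
    while igc.phase is not Phase.RESTING:
        if not visited or visited[-1] is not igc.phase:
            visited.append(igc.phase)
        igc.step(quota)
    return visited
```

Because step never crosses a phase boundary, a caller could pass a different quota for each phase, which mirrors the reason given above for the one-phase-per-invocation rule.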

Compacting Garbage Collector

The system also has a compacting garbage collector that runs atomically with respect to Smalltalk code. This garbage collector is a mark-and-sweep garbage collector that compacts both the OT and the corresponding object data. This garbage collector marks and sweeps every space in object memory except for PermSpace, whose objects are treated as roots for the purposes of this collector. If an object is not reclaimed by this garbage collector, then it is reachable either from the system roots or from an object in PermSpace.

Global Garbage Collector

The system also has a global, compacting garbage collector. It too runs atomically with respect to Smalltalk code. This garbage collector is identical to the compacting garbage collector except that it marks and sweeps all of the memory that is managed by the VM, including NewSpace, LargeSpace, OldSpace, and PermSpace. If an object is not reclaimed by this garbage collector, then it is reachable from the system roots.
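
The practical difference between the two collectors is their root set, which a toy reachability model makes concrete. The sketch below is in Python and is not the VM's algorithm: the heap layout (a dict mapping a name to its space and its references) and the function names are hypothetical, and compaction itself is not modeled.

```python
def mark(heap, roots):
    """Return the names of all objects reachable from roots."""
    live, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name not in live:
            live.add(name)
            stack.extend(heap[name][1])
    return live


def compacting_gc(heap, system_roots):
    """PermSpace objects count as roots, so they are never reclaimed."""
    perm = [n for n, (space, _) in heap.items() if space == "PermSpace"]
    live = mark(heap, list(system_roots) + perm)
    return {n for n in heap if n not in live}


def global_gc(heap, system_roots):
    """Only the system roots count; dead PermSpace objects are reclaimed."""
    live = mark(heap, system_roots)
    return {n for n in heap if n not in live}


# name -> (space, names of referenced objects)
heap = {
    "root": ("OldSpace", ["a"]),
    "a": ("NewSpace", []),
    "p": ("PermSpace", ["b"]),   # not reachable from the system roots
    "b": ("OldSpace", []),       # kept alive only by "p"
    "c": ("LargeSpace", []),     # unreachable from anywhere
}
```

In this example compacting_gc(heap, ["root"]) reclaims only "c", while global_gc(heap, ["root"]) also reclaims "p" and "b".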

Data Compactor

The system also has an OldSpace data compactor. Because this facility does not try to compact the OT or to mark live objects, it runs considerably faster than either of the two garbage collectors. It should be invoked when OldSpace data is overly fragmented.
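
A minimal sketch of what data-only compaction means, written in Python under stated assumptions: the table below stands in for OT entries that record where each allocated body lives, and compact_data is a hypothetical name. The text does not describe how the real compactor chooses or moves bodies, so this only illustrates sliding bodies together while the entries themselves stay where they are.

```python
def compact_data(table):
    """Slide allocated bodies toward address 0, preserving their order.

    table maps an entry id to a mutable [offset, size] pair. The entries
    are updated in place; no marking is done and no entry is removed.
    Returns the first free address after compaction.
    """
    next_free = 0
    for body in sorted(table.values(), key=lambda b: b[0]):
        body[0] = next_free        # the body moves, its entry does not
        next_free += body[1]
    return next_free
```

For instance, bodies at offsets 0, 30 and 50 with sizes 10, 5 and 20 end up at offsets 0, 10 and 15, leaving one contiguous free region starting at 35.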