 |
|
|
Oct 18 1999. |
Barry Hayes Scavenger Notes
|
Date: Tuesday, 28 January 1997 11:48:56
From: eliot@parcplace.com
Subject: Re: FixedSpace and Others
Hi John,
I'm forwarding your message to Barry Hayes who's our resident
GC guru and the actual designer and implementor of FixedSpace.
I'm sure he can help better than I can.
John --
Feel free to ask me any more you need to know ..
A scavenge happens, usually, when Eden is nearly full -- there's
some buffering room -- and copies Eden and the full semi-space
to the empty semi-space. The idea is that most of the objects
in Eden will be garbage, so the target space of the scavenge
doesn't have to be as big as the source space. Thus, source is
Eden+live semi-space, target is empty semi-space.
Some volume of objects in the source semi-space is targeted for
tenure to oldSpace at the start. If the guess about how much
of Eden would survive is wrong and the target semi-space fills
up, the scavenge continues, but the rest of the survivors go
into oldSpace. The guess about how MUCH to tenure is based on
how full the space is. The target of exactly which objects to
tenure is based on their addresses -- objects below a certain
address are tenured. [In addition, objects with large bodies
are also tenured quickly -- I assume that
someone found that they have higher than usual survival times,
and copying the header was a waste of effort. In addition, the
scavenge may be called with the explicit GOAL of tenuring certain
objects -- for example, objects that are involved in a becomes:
that can't cope with newSpace objects. Real low-level technical
stuff ..]
The roots of a scavenge are the RT and the stacks. Maybe a few
strange globals as well, but no traversal of Old/Perm space.
Truth is, I'm a bit shakey about where the mark/sweep starts
as well. But the difference between a collect and a collect-all
is just that the oldRT is used is part of the root-set, and permSpace
isn't examined. So anyway, the roots are (1) the oldRT, (2) the
stacks. If this is a stop-the-world collection, like you get
from the pulldown menu, it's all pretty easy. The world is stopped,
the mark marks objects [including newSpace objects] but only
frees objects in oldSpace [or old and perm for a full GC]. The
objects in newSpace are dead, but the GC doesn't want to step
on the scavenger's toes -- the next scavenge will free those
objects.
Ah yes, it's starting to come back to me .. The incGC is tougher.
While all the other threads are running, and scavenges are going,
a background process is trying to find free memory by taking
limited slices of time. This starts at the oldRT and stacks.
The scavenger aids the incGC, as part of its job, by making sure
that when it promotes an object that is pointed to by a marked
object in the RT, that object is marked. Whew. If it promotes
an object that is pointed to from the RT but the old-space object
isn't marked, then either the incGC hasn't marked it yet, and
will now just trace through the promoted object, or the object
in the RT isn't really reachable, and the promoted object will
get collected in the sweep ..
[And another aside, but I hope you can handle it .. aggressively
tenuring the stuff at low addresses may be the WRONG thing to
do. The first part of the scavenge transitively traces the marked
objects in the RT. I assume that an object in newSpace pointed
to from oldSpace will probably survive. Just gut instinct. Promoting
THOSE objects is good. Then it lays down objects reachable from
the stacks and unmarked RT and then does the transitive closure
copying thing. Promoting THOSE objects may well be bad. IE, if
you're not in an incGC, the first objects that get tenured are
those reachable directly from the stack, maybe the newest of
the new. And they'll drag a lot of stuff over the border with
them in future scavenges .. Anyway. Digression, and more than
you probably
need ..]
You're right about fixed space. Objects bodies live in fixed
space, not headers. The headers are in the same new/old/perm
progression as any other object. When the header is collected,
the body is collected.
You say:
However I thought the Eden GC was generational stop and copy
to Old (this being the active semispace). The semispaces use
a concurrent copying GC (Bakers?) and tenure to OldSpace. (This
being a guess on my part, but I'd rather not sit in front of
300 peope and look silly).
No. When Eden is full, Eden and the "from"
space are copied to the "to" space together, with no
distinction made between Eden and fromSpace.
As to WHY it's done that way, I can be less definite. The newest
thing in Eden will almost CERTAINLY be alive. You're scavenging
because you just MADE it, after all. I suppose that the reason
you don't just copy to fromSpace is that it's mostly full already.
And besides, if you did that you'd need to find all the pointers
to Eden from fromSpace, and that would be another painful RT.
I don't know what you mean by "Eden GC". There are
three things that are collecting garbage. (1) The generational
scavenger, which does a two-space-plus-Eden Baker-style stop-and-copy.
(2) The stop-the-world mark-sweep compacting collector, which
collects old or old+perm, invoked from the pulldown or in fairly
nasty out-of-memory panics. (3) The incremental marking collector,
(which does no compaction) and collects objects with headers
in oldSpace (and bodies in old, large, fixed, and rarely permSpace).
Call these "The scavenger" "The compacting collector"
and "the incremental collector".
The lifetimes of large objects are going to depend on what you
are benchmarking. Well, THAT's not saying much. For example,
if window bitmaps are in Smalltalk, they might have fairly long
lives. That was the idea, I think, behind the way things work
in VW. I suspect that Frank had his mind changed and never got
around to doing anything about it in the code. He's mentioned
ripping out large space, in fact.
The Scavenger does adaptive tenuring ala Dave and Frank. And
as I said, it doesn't do it by AGE, but rather by location in
memory which is at best random, and as I said, perhaps worse.
In the VM, the incremental GC runs until it has done a bit of
stuff. This is defined as a certain about of marking or sweeping,
or whatever. The primitive to the VM is "Do some smallish
amount of stuff and return". I don't understand how the
VI wraps this, and I don't know what kind of interrupts you are
asking about.
--- you ask ---
So I assume that by scavenging from eden to the empty semispace
you put objects in order of allocation since these NEW objects
go at the top of the semispace and then when you copy objects
from the other semispace they've survivied at least one scavenge
event so if room is short you tenure stuff you can't fit to OldSpace
which then possibly might have1+N scavenges?
-------------
The scavenge is a stop-the-world two space
copy with an Eden space, not incremental or concurrent. It is
designed to be fast enough to not require a long pause.
New object in eden are in creation order. But many people have
pointed out that two-space copying is breadth-first traversal,
and that is the order that things get into after the copy, not
creation order. Trace down what happens if from-space contains
just two linked lists.
Again, the objects end up in breadth-first order, but it seems
as if there's another question in there as well, but I can't
quite figure it out ..
Oh yes yes. THAT interrupt. Sorry -- you were clear, I was foggy.
Well, as I said, the incGC does a small amount of work at each
call. In theory, small enough to be within the "acceptable
pause" parameters. But if there's a process with higher
priority than the incGC process, it will blow off that time-slice
and return, and will quickly get pre-empted. Otherwise, it would
take the full slice, and THEN do the switch. This would make
response to the process switch [assume it's something like an
I/O request] sluggish.
Slogging through the code, it looks like the idleLoop incGC allows
interrupts here the lowSpaceAction does not. So the lowSpaceAction
forces the incGC to make progress even if a higher priority process
becomes runnable part way through the time-slice, whereas the
idleLoop incGC calling will return early and let the other process
run.
|