Oct 6th 1999.

Old Space: a place to die?


Parts copyright John M McIntosh 1996. All rights reserved.

Old Space memory management in VisualWorks is an important key to managing overall image growth and dynamics.

In my last article, "new: Where's the Beef" (The Smalltalk Report, Vol. #, No. #, Feb. 1997), I touched on the idleLoopProcess and how it attempts to clean up dead objects in Old Space. I was profiling an image that acted as a server when I was led to write this series of articles. Within this image a large number of objects were created, living just long enough to be moved to Old Space, where they promptly died! This situation is a classic case of the 'early tenuring problem', which was discussed in detail in my first article, "Controlling VisualWorks NewSpace" (The Smalltalk Report, Vol. #, No. #, Dec. 1996). The effect on the image was that it grew to 18MB before reaching a steady state, even though an explicit garbage collection (GC) would report that 8MB were free. I ran into this problem because database and network activity kept short-lived objects alive for many seconds. An evil side effect of client/server processing? Enlarging NewSpace as suggested in my first article didn't quite solve the problem. Many of these objects lived a long time, which made the working set too large. It is necessary to delve deeper into the problem! Once an object is tenured into Old Space, how and when does the image look for corpses?

The generation scavenging garbage collector used by VW rests on the observation that most objects die shortly after they are born. The remaining 5-odd percent of objects that survive beyond infancy are deemed to live forever... certainly trillions of nanoseconds! Once these objects are identified, they are tenured into Old Space. However, it is necessary to look into Old Space for corpses in order to keep image growth under control, since placing an object into Old Space doesn't guarantee that it will in fact live forever. Unlike the NewSpace GC, which runs hidden within the VM, the Old Space GC is under your full control. Garbage collection doesn't occur unless you ask for it. Yes, you as the developer have full control to dictate how often, how much, or when this should happen! To run it on demand, you can invoke ObjectMemory quickGC to have the Incremental Garbage Collector (IGC) do a full cycle to clean up dead objects. Alternatively, ObjectMemory compactMemory can be used to do both a GC and a compacting cycle, which will take more time but possibly give better results. The compacting cycle moves objects around in memory to condense free space, maximizing the size of free blocks and minimizing their total number. ParcPlace lays a foundation for you to ensure the Old Space GC runs from time to time.
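The effect of a compacting cycle can be pictured with a small toy model. The sketch below is written in Python purely for illustration; it is not VisualWorks code, and the flat list of blocks, the block sizes, and the function names are all assumptions made for the example.

```python
from typing import List, Tuple

# Each block is (is_live, size_in_bytes): a hypothetical flat view of a heap segment.
Block = Tuple[bool, int]


def free_block_sizes(heap: List[Block]) -> List[int]:
    """Return the sizes of maximal runs of adjacent free blocks."""
    sizes: List[int] = []
    run = 0
    for is_live, size in heap:
        if is_live:
            if run:
                sizes.append(run)
            run = 0
        else:
            run += size
    if run:
        sizes.append(run)
    return sizes


def compact(heap: List[Block]) -> List[Block]:
    """Slide live blocks to the front, leaving a single free block at the end."""
    live = [block for block in heap if block[0]]
    free_total = sum(size for is_live, size in heap if not is_live)
    return live + ([(False, free_total)] if free_total else [])


# Made-up heap layout: live and dead blocks interleaved.
heap = [(True, 100), (False, 40), (True, 60), (False, 25),
        (False, 15), (True, 30), (False, 80)]
before = free_block_sizes(heap)
after = free_block_sizes(compact(heap))
print("free blocks before:", before)
print("free blocks after: ", after)
```

The total amount of free space is unchanged by compaction; what changes is that it ends up in fewer, larger blocks, which is what matters when a large allocation arrives.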

The foundation supplied by ParcPlace avoids your explicit involvement by providing a low priority process, the idle loop process, which is referenced by ObjectMemory's class variable IdleLoopProcess. This process occasionally runs the IGC based on allocation activity and on various magic numbers and algorithms. As mentioned in my previous article, this activity is tied to the VM scavenge event. You must remember, however, that this is a low priority task! Other higher priority tasks, such as your domain model, windows, etc., will run before this process does. This is where the trouble begins!

The IGC is a special GC that runs in discrete steps, or increments, spreading GC work among the idle points within more important application work. It uses a classic mark/sweep algorithm to perform the GC. Some background information on mark/sweep theory can be found in Kent Beck's 'Garbage Collection Revealed', The Smalltalk Report, Vol. 4, No. 5, Feb. 1995. That article discusses VisualSmalltalk and gives the reader a detailed introduction to GC theory. As the IGC runs, it moves through five distinct phases of operation, which you can monitor via ObjectMemory current incrementalGCState.

#resting Idle (not active)
#marking Marking live objects. Starting from the roots of the world, each accessible object is marked as living.
#nilling Zeroing the slots of WeakArrays whose referents have been GCed.
#sweeping Looking for unmarked objects. These are considered dead and are placed on the Old Space free lists.
#aborting Aborting the mark/sweep cycle. Either the user asked to abort the cycle, or memory got too low during the cycle for it to continue.
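To make the phases concrete, here is a toy incremental mark/sweep collector. It is a Python sketch for illustration only and says nothing about how the VisualWorks VM is implemented: the class name, the dictionary-based object graph, the quota unit (one object per unit of work), and the treatment of nilling as a single step are all assumptions, and the #aborting phase is not modeled.

```python
from typing import Dict, List, Optional, Set


class ToyIncrementalGC:
    """Toy mark/sweep collector that advances in bounded steps (not the VW VM)."""

    def __init__(self, graph: Dict[str, List[str]], roots: List[str],
                 weak_slots: Dict[str, Optional[str]]) -> None:
        self.graph = graph            # strong references: object -> referents
        self.roots = roots
        self.weak_slots = weak_slots  # weak slot name -> referent (or None)
        self.state = "resting"
        self.marked: Set[str] = set()
        self.pending: List[str] = []
        self.sweep_queue: List[str] = []
        self.free_list: List[str] = []

    def step(self, quota: int) -> str:
        """Do at most `quota` units of work, then report the current phase."""
        if self.state == "resting":
            # Start a new cycle from the roots.
            self.marked, self.pending = set(), list(self.roots)
            self.state = "marking"
        while quota > 0:
            if self.state == "marking":
                if not self.pending:
                    self.state = "nilling"
                    continue
                obj = self.pending.pop()
                if obj not in self.marked:
                    self.marked.add(obj)
                    self.pending.extend(self.graph.get(obj, []))
            elif self.state == "nilling":
                # Clear weak slots whose referents were not reached.
                for slot in list(self.weak_slots):
                    if self.weak_slots[slot] not in self.marked:
                        self.weak_slots[slot] = None
                self.sweep_queue = [o for o in self.graph if o not in self.marked]
                self.state = "sweeping"
            else:  # sweeping
                if not self.sweep_queue:
                    self.state = "resting"
                    break
                dead = self.sweep_queue.pop()
                del self.graph[dead]
                self.free_list.append(dead)
            quota -= 1
        return self.state


# "c" and "d" are unreachable from the root; weak slot "w2" points at "d".
collector = ToyIncrementalGC(
    graph={"root": ["a"], "a": ["b"], "b": [], "c": ["d"], "d": []},
    roots=["root"],
    weak_slots={"w1": "b", "w2": "d"},
)
phases = []
while True:
    phases.append(collector.step(quota=2))
    if phases[-1] == "resting":
        break
print(phases, sorted(collector.free_list), collector.weak_slots)
```

The point of the `quota` argument is that the caller decides how much collection work to pay for on each call, so the cycle can be spread across the idle moments of an application.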

The VisualWorks IGC is very powerful. It can be invoked in phases, be asked to run until interrupted, or work only until certain goals, expressed in bytes and objects examined, are met. With these features, one can control the effort dedicated to a mark/sweep cycle.

After reading the comments in VW and considering the problem I had with my server application, I wondered why there was still a problem with dead objects collecting in Old Space. Shouldn't the IGC clean it up? Obviously more investigation was required to understand why the IGC didn't work quite right.

My original problem application had five processes doing polling work. I realized that they were in fact starving the idleLoopProcess of CPU time, preventing it from doing anything. As a result, dead objects started to collect in the corners faster than the CPU-starved idleLoopProcess could sweep them up. Once this happens, the image runs low on free memory, and another memory management feature kicks in.
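The starvation effect is easy to reproduce in a toy strict-priority scheduler. The Python sketch below is only a model: the priority numbers, the process names, and the assumption that each poller is ready on 7 of every 10 ticks in the second run are invented for illustration and are not VisualWorks values.

```python
from typing import Callable, Dict, List, Tuple

# (name, priority, is_ready(tick)): a hypothetical description of a process.
Process = Tuple[str, int, Callable[[int], bool]]


def simulate(ticks: int, processes: List[Process]) -> Dict[str, int]:
    """Strict priority scheduling: each tick goes to the highest-priority ready process."""
    run_counts = {name: 0 for name, _, _ in processes}
    for tick in range(ticks):
        ready = [(prio, name) for name, prio, is_ready in processes if is_ready(tick)]
        if ready:
            _, winner = max(ready)
            run_counts[winner] += 1
    return run_counts


def always(tick: int) -> bool:
    return True


def mostly(tick: int) -> bool:
    # Ready on 7 of every 10 ticks; waiting (for I/O, say) on the other 3.
    return tick % 10 < 7


IDLE: Process = ("idleLoop", 10, always)
busy = [(f"poller{i}", 50, always) for i in range(5)] + [IDLE]
polite = [(f"poller{i}", 50, mostly) for i in range(5)] + [IDLE]

starved = simulate(1000, busy)
relaxed = simulate(1000, polite)
print("idle ticks with busy pollers:  ", starved["idleLoop"])
print("idle ticks with polite pollers:", relaxed["idleLoop"])
```

As long as at least one higher priority process is always ready to run, the idle loop never gets a tick, no matter how much garbage is waiting for it.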

LowSpaceAction

As mentioned last month, VW tries to avoid serious GC work when allocating storage for an object. The responsibility for doing this is shared between the low priority idleLoopProcess and a high priority process called the lowSpaceProcess. The lowSpaceProcess waits on the ObjectMemory class variable LowSpaceSemaphore, which is signaled by the VM when the amount of free memory drops below certain values. A quick look at a minimal image shows a high value of 960K and a critical low value of about 244K. These values change, however, based on the application dynamics and memory usage. The lowSpaceProcess, once started, invokes the current MemoryPolicy>>lowSpaceAction, which then performs a number of actions attempting to fix the low memory problem.

First we see if free space is abundant. There must be at least 244K free (ObjectMemory class variable HardLowSpaceLimit), and the largest contiguous free memory block must exceed 280K (ObjectMemory's instance variable reservedContinguousFreeBytes). If these criteria are met, then MemoryPolicy>>incrementalReclamation is invoked to run the IGC in phases to look for a number of dead objects based on heuristic calculations, and the LowSpaceAction processing is complete. The amount of effort expended on looking for dead objects is a function of a key MemoryPolicy instance variable called incGCAccelerationFactor. This will be explained later.

If free space is too low, MemoryPolicy>>favorGrowthOverReclamation is invoked. This method is used to decide if image growth should occur. The default logic as shipped by ParcPlace is to grow the image to at least the size indicated by MemoryPolicy growthRegimeUpperBound (15.3MB) before doing any serious garbage collection activity. As long as this limit is not reached, the image will grow. Growth is cheap as long as you do not start paging. Freedom of growth can be restricted by altering growthRegimeUpperBound to indicate where you would like more aggressive GC processing to occur. Altering this value may curb your application's appetite for monopolistic image growth, assuming you need to share your machine with other applications.

After a possible image growth, free space is checked against 293K, the sum of ObjectMemory's class variable HardLowSpaceLimit (244K) and MemoryPolicy's instance variable availableSpaceSafetyMargin (49K). If space is below this value, or if contiguous free space is below 49K, then a full IGC cycle is forced to free up space, along with possibly running the compacting GC. Needless to say, these actions are expensive and time consuming to run! At this point, you get to see all the GC cursors, and possibly pound on your page space volume.

If space is still low after all this work, a critical point is reached. Our back is against the wall! We will ignore our recommended growth policy and attempt to force an image growth. However, if previous growth requests have exceeded MemoryPolicy's instance variable memoryUpperBound, the growth request is denied! The default MemoryPolicy memoryUpperBound is set to SmallInteger>>maxVal, which is 536,870,911 bytes (2^29 - 1). I have heard that altering this value to a larger number is a quick way to a VM crash! My testing under Windows NT shows that you can set it higher, but doing so doesn't affect the amount of memory the VM reserves with NT at startup. If you want to be a good citizen and restrict growth before bumping into your page space limit, you can set this instance variable to a smaller amount. This will cap aggressive image growth.

Finally, if space is still too low, and there is still enough free space on hand for a notifier window, the user may get either the 'Space warning 1234 bytes left: ' or the 'Emergency: No Space left' notifier window. I can't say I've ever seen the Space warning message on machines today. You hit the brick wall way too quickly! 'Emergency: No Space left', or some other obscure message from the VM, is always my first indication of a memory problem. As a rule, an image with a runaway memory allocation problem will grow until you hit the memoryUpperBound limit, fill your OS's entire page space, or can no longer bear to wait for control to return.
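The sequence of decisions described in this section can be summarized as a small model. The Python sketch below is an illustration of the logic as described in the text, not the source of MemoryPolicy>>lowSpaceAction. The thresholds are the ones quoted above for a minimal image; the Image and Policy classes, the 1MB growth segment, the name contiguous_safety_margin, and the assumption that "still low" is always judged against the same 293K/49K thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

K = 1024
MB = 1024 * K


@dataclass
class Policy:
    # Values quoted in the article for a minimal image.
    hard_low_space_limit: int = 244 * K
    reserved_contiguous_free_bytes: int = 280 * K
    available_space_safety_margin: int = 49 * K
    contiguous_safety_margin: int = 49 * K  # hypothetical name for the 49K contiguous check
    growth_regime_upper_bound: int = int(15.3 * MB)
    memory_upper_bound: int = 2**29 - 1


@dataclass
class Image:
    size: int          # total image size in bytes
    free: int          # free Old Space bytes
    largest_free: int  # largest contiguous free block
    garbage: int       # dead bytes a full collection would recover


def grow(img: Image, segment: int) -> None:
    img.size += segment
    img.free += segment
    img.largest_free = max(img.largest_free, segment)


def low_space_action(img: Image, p: Policy, segment: int = MB) -> List[str]:
    """Return the ordered list of actions this toy model would take."""
    # 1. Space is still reasonably abundant: run the IGC in phases and stop.
    if (img.free >= p.hard_low_space_limit
            and img.largest_free > p.reserved_contiguous_free_bytes):
        return ["incrementalReclamation"]

    actions: List[str] = []
    low_mark = p.hard_low_space_limit + p.available_space_safety_margin  # 293K

    def still_low() -> bool:
        return img.free < low_mark or img.largest_free < p.contiguous_safety_margin

    # 2. Below the growth regime bound, growing is preferred to collecting.
    if img.size < p.growth_regime_upper_bound:
        grow(img, segment)
        actions.append("grow")

    # 3. Still short of space: full IGC cycle plus compaction.
    if still_low():
        img.free += img.garbage
        img.garbage = 0
        img.largest_free = img.free  # compaction merges the free space
        actions.append("full IGC + compaction")

    # 4. Back against the wall: force growth unless the upper bound forbids it.
    if still_low():
        if img.size + segment <= p.memory_upper_bound:
            grow(img, segment)
            actions.append("forced growth")
        else:
            actions.append("growth denied")

    # 5. Nothing helped: the user gets a notifier.
    if still_low():
        actions.append("notifier")
    return actions


print(low_space_action(Image(8 * MB, 100 * K, 20 * K, 0), Policy()))
```

Note how an image below growthRegimeUpperBound simply grows without any collection at all, even if most of Old Space is garbage. That is the behavior behind the image growth described at the start of this article.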

The incGCAccelerationFactor

This instance variable is used to moderate how much work the IGC should do to find dead objects in Old Space when we run low on memory and are making GC work a high priority objective. The default value is 5. Altering it to a higher number makes the IGC work harder. This in turn affects response time, since each IGC cycle does more work, though it may also find more dead objects. You can alter the value to a very high number, but the end calculations are limited by other upper bound constants, which would in turn need to be raised. These are found in the MemoryPolicy methods maxBytesToNil, maxObjectsToMark, maxObjectsToSweep, and maxObjectsToUnmark.
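The interaction between the acceleration factor and the upper bound constants can be sketched as follows. This Python fragment only illustrates the capping behavior described above; the actual calculation inside MemoryPolicy is not shown in this article, so the linear scaling, the function name, and the numbers used are assumptions.

```python
def igc_quota(base_quota: int, acceleration_factor: int, upper_bound: int) -> int:
    """Scale a work quota by the acceleration factor, limited by an upper bound."""
    return min(base_quota * acceleration_factor, upper_bound)


# Hypothetical figures: a base of 2,000 objects per pass, capped at 30,000.
BASE_OBJECTS = 2_000
MAX_OBJECTS = 30_000

quotas = {factor: igc_quota(BASE_OBJECTS, factor, MAX_OBJECTS)
          for factor in (5, 10, 20, 100)}
for factor, quota in quotas.items():
    print(f"factor {factor:>3}: examine up to {quota} objects")
```

Under these made-up numbers, raising the factor from 20 to 100 changes nothing, because the quota is already pinned at the cap. To get more work out of each cycle, the cap itself would have to be raised as well.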

A 1990 copy of ObjectWorks, the precursor of VisualWorks, contained almost the same low space logic. However, computing power, memory, and application demands have all increased dramatically over the past six years. On talking to some of the senior folks at ParcPlace-Digitalk, I learned that most of the GC performance objectives were set based on machines built in the late 1980s. Much of this logic is still found in VW today. Given the extraordinary advances in CPU performance, I'd expect one can crank up the acceleration factor without noticing any great impact on application response times. With this in mind, let us examine a test case:

We will again use the EarlyTenureTest class that I created for my first article in this series, this time to create thousands of 750-byte objects and to continue doing so for 60 seconds. This problem class causes the spillage of objects into Old Space, where they promptly die... the 'early tenure problem'! Since our domain model is running flat out, the idleLoopProcess cannot do incremental GC work, because our user priority process starves the lower priority IGC process of CPU time. Because the idleLoopProcess cannot do the GC, the job falls to the lowSpaceProcess, which attempts some GC work before taking the default action of allocating more Old Space segments for the image. Earlier I pointed out that the amount of work the IGC does is altered by incGCAccelerationFactor. I expect that if we increase this variable more GC work will occur, but my application will run slower! To test this, I tried a number of acceleration factors and graphed them as shown in Fig A.

In our example the image starts at 6.8MB and grows to 11.6MB as it completes 10 units of work: nearly 5MB of growth using the default acceleration factor. As the incGCAccelerationFactor is increased from 5 to 20 in steps of 5, the final image size is reduced to 8.7MB, a savings of almost 3MB. The cost is that our application runs 7 percent slower: 10 units of work slows to about 9.3 units. Not a large performance penalty, given the memory savings! As I suspected, less memory is used but more CPU time is dedicated to GC work; we trade CPU cycles for memory. If a reduction of 3MB of memory solved a paging problem, this would be a good trade-off.
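The arithmetic behind these figures is worth spelling out. The short Python calculation below uses only the numbers quoted above; the variable names are mine.

```python
# Figures quoted in the text.
start_mb = 6.8
end_default_mb = 11.6   # incGCAccelerationFactor = 5
end_factor20_mb = 8.7   # incGCAccelerationFactor = 20
work_default = 10.0     # units of work completed in the test interval
work_factor20 = 9.3

growth_default = end_default_mb - start_mb      # growth with the default factor
growth_factor20 = end_factor20_mb - start_mb    # growth with the factor at 20
saved_mb = end_default_mb - end_factor20_mb     # memory saved
growth_cut_pct = 100 * saved_mb / growth_default
slowdown_pct = 100 * (work_default - work_factor20) / work_default

print(f"growth at factor 5:  {growth_default:.1f} MB")
print(f"growth at factor 20: {growth_factor20:.1f} MB")
print(f"memory saved:        {saved_mb:.1f} MB ({growth_cut_pct:.0f}% less growth)")
print(f"throughput cost:     {slowdown_pct:.0f}%")
```

Seen this way, a 7 percent loss of throughput buys roughly a 60 percent reduction in image growth.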

Finally, back to my server image. A viable solution for my application was to call the IGC once during a transaction processing cycle. In my case this was an easy solution, since I had a repetitive transaction logic path that was followed again and again, and I could easily insert a GC call at an appropriate point in the logic. But this is not a good solution, since I am worrying about GC issues in my application. An automatic garbage collector should worry about this problem, not me! Digging deeper, we see that the behavior of the lowSpaceProcess can be changed to ensure more GC work is done before image growth needs to occur, with a minor tradeoff in application performance. This is an acceptable solution in order to prevent excessive image growth. It enables us to avoid reaching an image size where paging occurs... which would in turn greatly impact the performance of the application. Although garbage collection is automatic, you do need to coax it a bit, or at least examine the assumptions made by the vendor.

Final Thoughts: PermSpace

There are many objects in a Smalltalk image that will live forever. Should they live in OldSpace? That would result in having to look at them every time a GC cycle is invoked, and having to move them around during a compaction. This would be expensive as well as a waste of resources. Generational GC theory applies once again: provide another space (PermSpace) to drop these permanent objects into. PermSpace is similar to OldSpace. However, no GC cycle is done in PermSpace unless you explicitly request it. Promoting objects into PermSpace is also an explicit task, and PermSpace does not grow. Promoting objects is done with a 'Perm save as' command, which tells the VM to move all objects from OldSpace to PermSpace when the image is restarted. By doing this, you migrate all objects that live forever within your image into an area where the IGC won't need to look at them again. For example, a new VisualWorks V2.5 image has 4.4MB of objects contained within PermSpace and about 110K of objects in OldSpace.

I hope I have solved some, if not all, of the mysteries of VisualWorks' garbage collection. This is the last article of my series on this subject. The task of doing Old Space GC work is complex, but not difficult to unravel! Once you know how it behaves, you can better understand how your image grows, as it balances GC work with performance objectives.