Does my NodeJS application leak memory? – 4

The heap, garbage collection and scavenges

In the last post I discussed the heap, dead and live objects and the object graph. Recapping…

Most of the allocations required by an application happen on the heap. The stack only contains SMIs (immediate 31-bit integers). This could be data or pointer to data on the heap. As functions execute, V8 cleans up the dead memory regions that have been abandoned by the stack. This cleanup event(A.K.A garbage collection) is triggered at intervals determined by V8. One such point is when V8 detects that it will soon run out of heap memory. The V8 subsystem that performs garbage collection is called the garbage collector. V8, like most JavaScript engines has a built in “garbage collector”.

V8 uses an extremely simple technique for garbage collection. In this it scans the stack for any references(handles) which are active(live) and considers the referenced objects live. The rest of the heap is considered garbage(dead) and is marked to be reclaimed for re-use.

One of the problems we face is that V8 is single threaded. When garbage collection happens the rest of the program stops. This is why it is called “stop the world” garbage collector. This has performance implications because if the execution of the program is held up for long it may result in sluggish behavior. From a UI standpoint this may result in browser rendering to be less than 60 fps resulting in the user experience being less than optimal. More information on this sluggishness in browser rendering, also called “Jank” can be found here.

Collection types and cycles

If you think about it, most of the allocations we request are localized to the executing function. In comparison the allocations that require to be alive across multiple function calls(e.g. globals) are fewer. For this reason V8 uses a “Generational” garbage collection system where the lifetime of the objects determine their placement in the heap. In this the heap is divided into two major lifetime sections or generations. The new and the old generation(or space). The new space holds objects that are short lived while the old space holds objects that are intended to be around for a longer period of time.

This is how it works. Most objects(if they are not too big) start their life in the new space(young generation). When memory is required, V8 quickly scans the new space for any live objects and considers the rest of the objects as dead and re-usable. This quick scan and cleanup is called a scavenge(minor collection cycle) and usually lasts less than a millisecond.

If an object survives two scavenge cycles it is considered “old” and moved(promoted) to the old space. At some point when V8 determines that it may need more memory in the old space it triggers a major garbage collection cycle(mark-sweep and mark compact) and removes dead objects from the old space.

One consequence of garbage collection is memory fragmentation. Dead and live objects are intermixed in the heap. If there are many of these pockets of free memory(space occupied by dead objects it would make allocation of new memory slow as objects would have to be split  up for storage. To fix this the garbage collector moves objects around and lays them out in a contiguous memory space. This process is called “memory compaction”.

Spaces in the Heap

To understand both the minor(scavenge) and major(mark-sweep and mark-compact) cycle, it is helpful to visualize how the heap is divided in to spaces and their functions. The heap is divided into multiple sections or spaces. The spaces that are relevant to us are shown in Figure 4.1. below which provides a pictorial representation.

Spaces in the heap

Briefly, here is what each space does:

  • New space: Short lived objects allocated here. Scavenge works in this space
  • Old Space = Old data space + Old pointer space: Promoted objects from new space and other raw data objects and pointers respectively. Mark-sweep and mark-compact work in this space.
  • Code space: Executable instructions
  • Other miscellaneous spaces: These include the Cell space, property cell space and map space and they contain specialized data and pointers. No garbage collection takes place here.

Within  these spaces we will concern ourselves only with the new space and the old space. Let’s look at scavenges and the new space first.

Scavenges

Consider this simple example(Figure 4.2) in which a string of 100,000 bytes  in length is created and stored in the local variable text. The setInterval timer runs the aString function every 200ms till the program is aborted .

Figure 4.2

It is clear that there should not be any memory growth, because every time the function returns the variable “text” goes out of scope and is discarded. The program does not have any practical application but works well for learning about scavenges. Running this piece of code with the trace_gc option in the terminal we get the output in figure 4.3.

Log

The dissected trace(the one in bold) is shown below in figure 4.4.

 

Trace Dissected

This is the simplest of the gc traces that V8 offers. It gives an excellent overview of memory allocation. This is also the first trace I run to check on how memory allocation is doing.  The explanations given in the figure for the dissected trace are self explanatory.

Analyzing the above trace:

  • It is a scavenge event, triggered as a result of allocation failure. In other words V8 ran out of memory in the new space and decided to reclaim dead space.
  • Each gc event is less then 1.0 ms. This is ok. If this time tends to increase then there is a possibility of a leak. It means that V8 is finding it hard to gc.
  • The difference in size of before and after objects which is about 1MB remains fairly constant.
  • The before and after overall memory size(41.1MB) remains fairly constant. This indicates that all of the dead space is being cleaned up. If it did not, then this would be another indication of a leak. In real applications it is not this ideal and some variation should be expected.

Continuing with the analysis let us assume that the new space is 1MB in size and that it is only used for text. Visualizing at a very simplistic level:

  1. For a 1MB sized new space the start and end of new space in the heap would be between 0 and 1048575 bytes
  2. Initially the pointer is at 0th byte.
  3. Allocation is requested for 100,000 bytes
  4. The pointer moves 100,000 bytes from 0 to 99999 bytes. The pointer is at the 100000th  position waiting for the next allocation.
  5. After the 10th allocation the pointer is at 1000000th position
  6. For the 11th allocation there is no memory left so a GC is triggered
  7. When allocation is requested for the text variable (11th allocation) V8 finds that there is no memory left in the new space.
  8. V8 reaches out to the stack and finds that there are no live objects. This means all of the new space contains dead objects and can be re-used.

Please keep in mind that V8 uses a variety of techniques to trigger a GC. Also, it tends to manipulate the size of new space depending on current conditions. This is the reason why the trigger points are never the same. It is the trend that is important and not the exactness of the values.

V8’s scavenge is based on Cheney’s algorithm. In this, the new space is divided into two semi-spaces, the to-space and from-space. There are five steps  that the garbage collector goes through to GC.

  1. Memory allocation starts in the to-space
  2. Once the to-space is out of memory a GC event is triggered
  3. To and from spaces are swapped out. At this point to-space is empty and from-space is full.
  4. Live objects are copied from from-space to to-space, laying them next to each other in contiguous memory locations. This also compacts the memory. from-space now only contains dead objects while to-space contains only live objects.

Note: Live objects are in blue, while the dead objects are in gray.

from-space is now fully reclaimed and can be reused and the cycle continues. For a pictorial representation see figure 4.5 below.

 

Cheyney's Alogorithm

We used a simple example to displayed the V8 GC trace and further dissected it to see how scavenges work. No special tools were used. As such, you can start using it with your existing projects. I will talk about Mark sweep and mark compact next time.

References:

  1. A tour of V8: Garbage Collection (An excellent series of blog posts on GC by Jay Conrod)
  2. An excellent collection of references on V8 performance by Thorsten Lorenz

Does my NodeJS application leak memory? – 3

The heap, objects dead or live?

In the last post I discussed the stack. To quickly recap, the stack is LIFO which is fast and managed automatically. It is small in size and stores only local variables(immediate small integers). Everything else is stored on the heap.

The heap, compared to the stack is bigger in size, more freeform in nature and stores reference types such as objects and strings. Variables that have to span function calls, including globals and variables captured by closures are also stored here.

The heap is dynamically allocated by the OS. It is self managed, that is, the running program(in this case V8)  makes requests from the OS for allocation and de-allocation. It is divided into multiple sections or spaces. The spaces that are relevant to us are the New Space, Old space, Code space, Map Space, Large object space. More on this when I talk about garbage collection. However for now it is extremely important to understand how the stack and the heap interact when a function or an application is executing.

A heap can be imagined as a network of interconnected objects. Consider the example we used in the previous post drawn a little differently.

leaks-post.010
Figure 3.1

In the above figure(3.1):

  1. aSmallInt is a SMI (immediate 31-bit integer) stored on the stack.
  2. aFloat is a number object so a reference(a handle in V8 speak) is stored on the stack with the number object on the heap.
  3. anObject is a literal object. Reference to this object is stored on the stack while the object itself is stored on the heap. The object is split up into three other string objects one each for a country code.

We can draw this in a different way:

leaks-post.011
Figure 3.2

This is a very simplistic rendering of an object graph. The local handles (or references) point to the objects on the heap. The fact that the heap is a network of interconnected objects becomes clear with an object graph.

As long as the parameters of the function test exists on the stack, its handles(references) exist and so do the objects on the heap. The objects in the heap are called live objects. Once the function ends  (returns) both the handles go out of scope and the objects in the heap are now considered dead. V8 can now reclaim the memory area used up by these objects. Understanding  dead and live objects is important for visualizing how the code affects garbage collection.

Objects, Dead or Live?

An object is considered live if it is being referenced by some chain of pointers to a root object or another live object. I will discuss two examples to illustrate this.

Simple Example

Consider the following simple snippet. The function  getCountryCode returns the countryCode if one is found or otherwise an empty string:

var countryCodes = {no: "+47", us: "+1", uk: "+44"};
var cc = '';

function getCountryCode(countryAbbreviation) {
    var ret = null;

    if (countryAbbreviation in countryCodes) {
       ret = countryCodes[countryAbbreviation];
    }

    return ret;
}

cc = getCountryCode('no');

 

In this code snippet:

  • The literal object countryCodes is global.
  • cc, the variable which will hold the results is initialized with an empty string and is a global.
  • countryAbbreviation is a local variable and a string.
  • ret, the return variable is also a local variable and a string.

1. Before getCountryCode is run the graph looks like in the following figure(3.4).  Note that both the global variables, cc and countryCodes are stored in the heap pointed to by global handles.

 

leaks-post.013
Figure 3.4

 

2. Before the functions ends, that is before getCountryCode returns, here is how the object graph looks (figure 3.5). Now there are two local variables. Both are string objects and stored on the heap pointed to by local handles.

leaks-post.014
Figure 3.5

 

 

3. Once the function ends(returns) the graph looks like in figure 3.6. As the function has ended, both ret and countryAbbreviation string objects have lost their references and now can be cleaned up by the garbage collector as these objects are dead. On the other hand both the global variables cc and countryCodes remain alive and the garbage collector will not touch them.

leaks-post.015
Figure 3.6

 

In summary:

  1. On entering the module, countryCodes object and cc which are global variables are allocated space on the heap. Both the globals are considered as root objects. The stack is empty at this point.
  2. On entering getCountryCode, ret and countryAbbreviation, both local variables (not small integers) are stored on the heap with the references on the stack. Both the local variables are considered root objects. They are also live objects as they can be accessed within the function scope. In other words they are live for the entire execution of the function.
  3. When the function returns, the global variable cc contains the value of 47, the country code for Norway.
  4. Both ret and countryAbbreviation go out of scope, are popped from the stack and discarded.
  5. The references to the heap for ret and countryAbbreviation are removed. Both ret and countryAbbreviation in the heap are considered dead.
  6. Both globals are still live as they can be used again till the program terminates.

 

Example with a Closure

Now consider the same example but in a form of a closure in which processCountryCodes returns a reference to the inner function getCountryCode. Please note that the variable getCountryCode to which the function is assigned is redundant and used here for clarity. I could have easily returned the function itself directly.

var cc, code;

function processCountryCodes() {
    var countryCodes = {no: "+47", usa: "+1", uk: "+44"};
    var getCountryCode = function (countryAbbreviation) {
        var ret = null;

        if (countryAbbreviation in countryCodes) {
            ret = countryCodes[countryAbbreviation];
        }
        return ret;
    };

    return getCountryCode;
}

cc = processCountryCodes();
code = cc('no');
console.log(code);//+47

1. Before the function processCountryCodes is run the object graph looks like in the figure(3.8) below. Both global variables are allocated space on the heap. They are referenced via global handles. Simple enough.

Figure 3.8
Figure 3.8

 

2. Before the function processCountryCodes returns the object graph looks like in the following figure 3.9:

Figure 3.9
Figure 3.9

The inner function getCountryCode is a closure and closes over the object countryCodes. In other words before processCountryCodes returns it needs to save a reference to the countryCodes object. This is because once processCountryCodes returns, the stack will be wiped clean. To retain the reference V8 automatically creates an internal object called the “Context object” which is an instance of the JSFunction class and adds the countryCodes object to it as a property. For the same reason V8 also allocates space for the inner function in the heap. Please remember that V8 creates the context object when it enters the outer function processCountryCodes.

3. Before the function processCountryCodes returns the object graph looks like in the figure 3.10. The global variable cc now holds a reference to the inner function getCountryCode.

Figure 3.10
Figure 3.10

 

4. On execution of the inner function the graph looks like in the figure 3.11 The global variable cc now holds a reference to the inner function getCountryCode. Global variable code now contains the result “+47”.

Figure 3.11
Figure 3.11

 

Please note that I have not included the objects for the inner function because that is the same as the earlier example.

In summary:

  1. On entering the module, cc and code which are globals are allocated space on the heap. The stack is empty at this point. cc and code are now considered live root objects.
  2. On entering processCountryCodes V8 builds a special “Context” object on the heap and adds the countryCodes object as a property to it.
  3. Before processCountryCodes returns, the local variable now holds a reference to the inner function which is allocated space on the heap
  4. Once processCountryCodes returns, global variable cc holds the reference to the inner function.
  5. Global variable holds the result “+47” once the inner function is executed via the global variable cc.

The key point here to note is that even though both processCountryCodes and getCountryCode have been executed the heap structure remains intact. The reason is that both the global variables will keep holding references to the objects in the heap till the program terminates.

I hope that you now have the necessary tools to visualize your code in terms of an object graph. In the next post I will talk about Garbage collection as it relates to the heap and build on the information in this post.

 

Done is better than Perfect

Earlier in the week I attended a NodeJS meet-up at Hacker Dojo. Outside the event room was a sign “Done is better than perfect”.

In May of 2015  the Amtrak Northeast Regional train from Washington, D.C. bound for New York City derailed resulting in 8 deaths. Investigations revealed that this particular route did not have the advanced safety technology (“positive train control” (PTC)) meant to prevent high-speed derailments. Under current law, the rail industry must adopt the technology by the end of this year.

Last month, 6 Irish students died in Berkeley when a balcony collapsed. One of the surviving students will never walk again. Mayor Tom Bates said “investigators believe the wood was not sealed properly at the time of construction and was damaged by moisture as a result. He said that appears to be the prime cause of Tuesday’s tragedy.”

The story is always the same.  Investigation after a horrific accident reveals that though the service was completed (done) it was not done “right”. Good practices and established processes were not followed. Just enough was done to pass an inspection or get a permit.

The Shinkansen (A.K.A Japan’s bullet train) was introduced in 1964. Over its 50-year history, carrying nearly 10 billion passengers, and traveling at speeds ranging from 120mph to 180mph, there have been no passenger fatalities due to derailments or collisions, despite frequent earthquakes and typhoons. As a young engineer on training it was a treat for me to travel on the Shinkansen from Osaka to Tokyo. I was amazed at the smoothness of the ride and marveled at the engineering feat that made it so. The Shinkansen was not built to perfection, it was just built right!

In the case of Bookout v. Toyota, Toyota was found guilty of a UA (unattended acceleration) death. It was due to a software bug which was a result of not following the established development processes. Michael Dunn in this EDN article titled “Toyota’s killer firmware: Bad design and its consequences”  provides an exhaustive explanation of the consequences of winging it. When the feds reached a settlement of $1.2B Eric Holder, the U.S. Attorney General said “When car owners get behind the wheel, they have a right to expect that their vehicle is safe. If any part of the automobile turns out to have safety issues, the car company has a duty to be upfront about them, to fix them quickly, and to immediately tell the truth about the problem and its scope. Toyota violated that basic compact.”

This same concept applies to the software we write. When a user uses our software they have a basic expectation that it will function as promised. That it will function under load, in different network conditions and if it involves critical systems it will have built in redundancies.

In the everyday world where software apps govern our daily lives, the phrase “Done is better than Perfect” has a very simple definition. Build small, build well, ship often. “Build well” means do it right, follow good development and test processes, and aim for proper working functionality.

Dave Rooney in his post puts it well, “Done” and “perfect” refer to the scope of the functionality, not its quality.”

Mark Zuckerberg in his letter to his shareholders about the “hacker culture” before the IPO wrote: “…Hackers try to build the best services over the long term by quickly releasing and learning from smaller iterations rather than trying to get everything right all at once. To support this, we have built a testing framework that at any given time can try out thousands of versions of Facebook. We have the words “Done is better than perfect” painted on our walls to remind ourselves to always keep shipping.” It was just about doing smaller iterations but every iteration done right. The iteration should work like the user expects.

Aileron in their post put it aptly,  “A corollary to “good is better than perfect” is “done is better than perfect”. This does not mean doing a mediocre job or producing a poor-quality product. Being done and finishing a task gives every business person an outcome. This result will give them important success or failure metrics to decide what their next step will be.”

Done right is better than perfect!